[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 1/3] vpci: allow queueing of mapping operations



On Fri, Aug 01, 2025 at 05:06:32PM -0400, Stewart Hildebrand wrote:
> On 7/25/25 03:58, Roger Pau Monné wrote:
> > On Thu, Jul 24, 2025 at 06:44:32PM +0200, Roger Pau Monné wrote:
> >> On Wed, Jul 23, 2025 at 12:37:41PM -0400, Stewart Hildebrand wrote:
> >>> @@ -283,7 +297,48 @@ static int __init apply_map(struct domain *d, const 
> >>> struct pci_dev *pdev,
> >>>      return rc;
> >>>  }
> >>>  
> >>> -static void defer_map(const struct pci_dev *pdev, uint16_t cmd, bool 
> >>> rom_only)
> >>> +static struct vpci_map_task *alloc_map_task(const struct pci_dev *pdev,
> >>> +                                            uint16_t cmd, bool rom_only)
> >>> +{
> >>> +    struct vpci_map_task *task = xzalloc(struct vpci_map_task);
> >>
> >> xvzalloc() preferably.
> >>
> >> This however introduces run-time allocations as a result of guest
> >> actions, which is not ideal IMO.  It would be preferable to do those
> >> allocations as part of the header initialization, and re-use them.
> > 
> > I've been thinking over this, as I've realized that while commenting
> > on it, I didn't provide any alternatives.
> > 
> > The usage of rangesets to figure out the regions to map is already not
> > optimal, as adding/removing from a rangeset can lead to memory
> > allocations.  It would be good if we could create rangesets with a
> > pre-allocated number of ranges (iow: a pool of struct ranges), but
> > that's for another patchset.  I think Jan already commented on this
> > aspect long time ago.
> 
> +1
> 
> > I'm considering whether to allocate the deferred mapping structures
> > per-vCPU instead of per-device.  That would for example mean moving
> > the current vpci_bar->mem rangeset so it's allocated in vpci_vcpu
> > struct instead.  The point would be to not have the rangesets per
> > device (because there can be a lot of devices, specially for the
> > hardware domain), but instead have those per-vCPU.  This should work
> > because a vCPU can only queue a single vPCI operation, from a single
> > device.
> > 
> > It should then be possible to allocate the deferred mapping structures
> > at vCPU creation.  I also ponder if we really need a linked list to
> > queue them; AFAIK there can only ever be an unmapping and a mapping
> > operation pending (so 2 operations at most).  Hence we could use a
> > more "fixed" structure like an array.  For example in struct vpci_vcpu
> > you could introduce a struct vpci_map_task task[2] field?
> > 
> > Sorry, I know this is not a minor change to request.  It shouldn't
> > change the overall logic much, but it would inevitably affect the
> > code.  Let me know what you think.
> 
> Thanks for the feedback and suggestion. Yeah, I'll give this a try.
> Here's roughly what I'm thinking so far. I'll keep playing with it.
> 
> diff --git a/xen/common/domain.c b/xen/common/domain.c
> index 5241a1629eeb..942c9fe7d364 100644
> --- a/xen/common/domain.c
> +++ b/xen/common/domain.c
> @@ -387,6 +387,16 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
>   */
>  static int vcpu_teardown(struct vcpu *v)
>  {
> +#ifdef CONFIG_HAS_VPCI
> +    for ( unsigned int i = 0; i < ARRAY_SIZE(v->vpci.task); i++ )
> +    {
> +        struct vpci_map_task *task = &v->vpci.task[i];
> +
> +        for ( unsigned int j = 0; j < ARRAY_SIZE(task->bars); j++ )
> +            rangeset_destroy(task->bars[j].mem);

You might want to additionally do:

task->bars[j].mem = NULL;

> +    }
> +#endif
> +
>      vmtrace_free_buffer(v);
>  
>      return 0;
> @@ -467,6 +477,26 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int 
> vcpu_id)
>          d->vcpu[prev_id]->next_in_list = v;
>      }
>  
> +#ifdef CONFIG_HAS_VPCI
> +    for ( unsigned int i = 0; i < ARRAY_SIZE(v->vpci.task); i++ )
> +    {
> +        struct vpci_map_task *task = &v->vpci.task[i];
> +
> +        for ( unsigned int j = 0; j < ARRAY_SIZE(task->bars); j++ )
> +        {
> +            struct vpci_bar_map *bar = &task->bars[j];
> +            char str[32];
> +
> +            snprintf(str, sizeof(str), "PCI map vcpu %u task %u BAR %u", 
> vcpu_id, i, j);
> +
> +            bar->mem = rangeset_new(v->domain, str, RANGESETF_no_print);

Not sure there's much point in naming those with that much detail -
those are scratch space for mapping calculations.  You already pass
RANGESETF_no_print, which means the contents of the rangeset won't be
dumped, and hence the name is kind of meaningless.  I shouldn't have
named those either when allocated in bar_add_rangeset().

> +
> +            if ( !bar->mem )
> +                goto fail_sched;
> +        }
> +    }
> +#endif
> +
>      /* Must be called after making new vcpu visible to for_each_vcpu(). */
>      vcpu_check_shutdown(v);
>  
> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
> index 17cfecb0aabf..afe78b00ffc9 100644
> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -116,7 +116,6 @@ struct vpci {
>              uint64_t guest_addr;
>              uint64_t size;
>              uint64_t resizable_sizes;
> -            struct rangeset *mem;
>              enum {
>                  VPCI_BAR_EMPTY,
>                  VPCI_BAR_IO,
> @@ -207,14 +206,23 @@ struct vpci {
>  #endif
>  };
>  
> +#ifdef __XEN__
>  struct vpci_vcpu {
>      /* Per-vcpu structure to store state while {un}mapping of PCI BARs. */
>      const struct pci_dev *pdev;
> -    uint16_t cmd;
> -    bool rom_only : 1;
> +    struct domain *domain;
> +    unsigned int nr_pending_ops;

Not sure you really need a pending ops counter?  Hard to tell without
seeing the code that makes use of it.

> +    struct vpci_map_task {
> +        struct vpci_bar_map {
> +            uint64_t addr;
> +            uint64_t guest_addr;
> +            struct rangeset *mem;
> +        } bars[PCI_HEADER_NORMAL_NR_BARS + 1];
> +        uint16_t cmd;
> +        bool rom_only : 1;
> +    } task[2];

Don't you need a way to differentiate between map/unmap operations?
Do you plan to use slot 0 as unmap and slot 1 as map?  Or would you
rather introduce a boolean field to signal it in struct vpci_map_task?

Overall seems OK, but obviously it all needs to fit together with the
current code :).

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.