Xen project Mailing List

Re: [PATCH v2 1/3] vpci: allow queueing of mapping operations

To: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Mon, 4 Aug 2025 15:55:54 +0200

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>

Delivery-date: Mon, 04 Aug 2025 13:55:58 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Aug 01, 2025 at 05:06:32PM -0400, Stewart Hildebrand wrote: > On 7/25/25 03:58, Roger Pau Monné wrote: > > On Thu, Jul 24, 2025 at 06:44:32PM +0200, Roger Pau Monné wrote: > >> On Wed, Jul 23, 2025 at 12:37:41PM -0400, Stewart Hildebrand wrote: > >>> @@ -283,7 +297,48 @@ static int __init apply_map(struct domain *d, const > >>> struct pci_dev *pdev, > >>> return rc; > >>> } > >>> > >>> -static void defer_map(const struct pci_dev *pdev, uint16_t cmd, bool > >>> rom_only) > >>> +static struct vpci_map_task *alloc_map_task(const struct pci_dev *pdev, > >>> + uint16_t cmd, bool rom_only) > >>> +{ > >>> + struct vpci_map_task *task = xzalloc(struct vpci_map_task); > >> > >> xvzalloc() preferably. > >> > >> This however introduces run-time allocations as a result of guest > >> actions, which is not ideal IMO. It would be preferable to do those > >> allocations as part of the header initialization, and re-use them. > > > > I've been thinking over this, as I've realized that while commenting > > on it, I didn't provide any alternatives. > > > > The usage of rangesets to figure out the regions to map is already not > > optimal, as adding/removing from a rangeset can lead to memory > > allocations. It would be good if we could create rangesets with a > > pre-allocated number of ranges (iow: a pool of struct ranges), but > > that's for another patchset. I think Jan already commented on this > > aspect long time ago. > > +1 > > > I'm considering whether to allocate the deferred mapping structures > > per-vCPU instead of per-device. That would for example mean moving > > the current vpci_bar->mem rangeset so it's allocated in vpci_vcpu > > struct instead. The point would be to not have the rangesets per > > device (because there can be a lot of devices, specially for the > > hardware domain), but instead have those per-vCPU. This should work > > because a vCPU can only queue a single vPCI operation, from a single > > device. > > > > It should then be possible to allocate the deferred mapping structures > > at vCPU creation. I also ponder if we really need a linked list to > > queue them; AFAIK there can only ever be an unmapping and a mapping > > operation pending (so 2 operations at most). Hence we could use a > > more "fixed" structure like an array. For example in struct vpci_vcpu > > you could introduce a struct vpci_map_task task[2] field? > > > > Sorry, I know this is not a minor change to request. It shouldn't > > change the overall logic much, but it would inevitably affect the > > code. Let me know what you think. > > Thanks for the feedback and suggestion. Yeah, I'll give this a try. > Here's roughly what I'm thinking so far. I'll keep playing with it. > > diff --git a/xen/common/domain.c b/xen/common/domain.c > index 5241a1629eeb..942c9fe7d364 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -387,6 +387,16 @@ static int vmtrace_alloc_buffer(struct vcpu *v) > */ > static int vcpu_teardown(struct vcpu *v) > { > +#ifdef CONFIG_HAS_VPCI > + for ( unsigned int i = 0; i < ARRAY_SIZE(v->vpci.task); i++ ) > + { > + struct vpci_map_task *task = &v->vpci.task[i]; > + > + for ( unsigned int j = 0; j < ARRAY_SIZE(task->bars); j++ ) > + rangeset_destroy(task->bars[j].mem); You might want to additionally do: task->bars[j].mem = NULL; > + } > +#endif > + > vmtrace_free_buffer(v); > > return 0; > @@ -467,6 +477,26 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int > vcpu_id) > d->vcpu[prev_id]->next_in_list = v; > } > > +#ifdef CONFIG_HAS_VPCI > + for ( unsigned int i = 0; i < ARRAY_SIZE(v->vpci.task); i++ ) > + { > + struct vpci_map_task *task = &v->vpci.task[i]; > + > + for ( unsigned int j = 0; j < ARRAY_SIZE(task->bars); j++ ) > + { > + struct vpci_bar_map *bar = &task->bars[j]; > + char str[32]; > + > + snprintf(str, sizeof(str), "PCI map vcpu %u task %u BAR %u", > vcpu_id, i, j); > + > + bar->mem = rangeset_new(v->domain, str, RANGESETF_no_print); Not sure there's much point in naming those with that much detail - those are scratch space for mapping calculations. You already pass RANGESETF_no_print, which means the contents of the rangeset won't be dumped, and hence the name is kind of meaningless. I shouldn't have named those either when allocated in bar_add_rangeset(). > + > + if ( !bar->mem ) > + goto fail_sched; > + } > + } > +#endif > + > /* Must be called after making new vcpu visible to for_each_vcpu(). */ > vcpu_check_shutdown(v); > > diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h > index 17cfecb0aabf..afe78b00ffc9 100644 > --- a/xen/include/xen/vpci.h > +++ b/xen/include/xen/vpci.h > @@ -116,7 +116,6 @@ struct vpci { > uint64_t guest_addr; > uint64_t size; > uint64_t resizable_sizes; > - struct rangeset *mem; > enum { > VPCI_BAR_EMPTY, > VPCI_BAR_IO, > @@ -207,14 +206,23 @@ struct vpci { > #endif > }; > > +#ifdef __XEN__ > struct vpci_vcpu { > /* Per-vcpu structure to store state while {un}mapping of PCI BARs. */ > const struct pci_dev *pdev; > - uint16_t cmd; > - bool rom_only : 1; > + struct domain *domain; > + unsigned int nr_pending_ops; Not sure you really need a pending ops counter? Hard to tell without seeing the code that makes use of it. > + struct vpci_map_task { > + struct vpci_bar_map { > + uint64_t addr; > + uint64_t guest_addr; > + struct rangeset *mem; > + } bars[PCI_HEADER_NORMAL_NR_BARS + 1]; > + uint16_t cmd; > + bool rom_only : 1; > + } task[2]; Don't you need a way to differentiate between map/unmap operations? Do you plan to use slot 0 as unmap and slot 1 as map? Or would you rather introduce a boolean field to signal it in struct vpci_map_task? Overall seems OK, but obviously it all needs to fit together with the current code :). Thanks, Roger.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.