Re: [PATCH 2/2] x86/msi: Allow writes to registers on the same page as MSI-X table
On Fri, Nov 18, 2022 at 08:20:14AM +0100, Jan Beulich wrote:
> On 17.11.2022 18:31, Marek Marczykowski-Górecki wrote:
> > On Thu, Nov 17, 2022 at 05:34:36PM +0100, Jan Beulich wrote:
> >> On 14.11.2022 20:21, Marek Marczykowski-Górecki wrote:
> >>> --- a/xen/arch/x86/msi.c
> >>> +++ b/xen/arch/x86/msi.c
> >>> @@ -961,6 +961,21 @@ static int msix_capability_init(struct pci_dev *dev,
> >>> domain_crash(d);
> >>> /* XXX How to deal with existing mappings? */
> >>> }
> >>> +
> >>> +    /*
> >>> +     * If the MSI-X table doesn't span full page(s), map the last page
> >>> +     * for passthrough accesses.
> >>> +     */
> >>> +    if ( (msix->nr_entries * PCI_MSIX_ENTRY_SIZE) & (PAGE_SIZE - 1) )
> >>> +    {
> >>> +        uint64_t entry_paddr = table_paddr +
> >>> +                               msix->nr_entries * PCI_MSIX_ENTRY_SIZE;
> >>> +        int idx = msix_get_fixmap(msix, table_paddr, entry_paddr);
> >>> +
> >>> +        if ( idx >= 0 )
> >>> +            msix->last_table_page = fix_to_virt(idx);
> >>> +        else
> >>> +            gprintk(XENLOG_ERR,
> >>> +                    "Failed to map last MSI-X table page: %d\n", idx);
> >>> +    }
> >>
> >> Could we avoid the extra work when there's less than one page's worth
> >> of entries for a device? But then again maybe it's not worth any extra
> >> code, as the same mapping will be re-used anyway due to the refcounting
> >> that's being used.
> >
> > I was considering that, but decided against it exactly because
> > msix_get_fixmap() reuses existing mappings.
> >
> >> Makes me think of another aspect though: Don't we also need to
> >> handle stuff living on the same page as the start of the table, if
> >> that doesn't start at a page boundary?
> >
> > I have considered that, but decided against it, given that every single
> > device I tried has its MSI-X table at a page boundary. But if you prefer,
> > I can add such handling too (it will require adding another variable to
> > the arch_msix structure - to store the fixmap location).
>
> To limit growth of the struct, please at least consider storing the fixmap
> indexes instead of full pointers.
Ok.
> >>> @@ -1090,6 +1105,12 @@ static void _pci_cleanup_msix(struct arch_msix *msix)
> >>>              WARN();
> >>>          msix->table.first = 0;
> >>>          msix->table.last = 0;
> >>> +        if ( msix->last_table_page )
> >>> +        {
> >>> +            msix_put_fixmap(msix,
> >>> +                            virt_to_fix((unsigned long)msix->last_table_page));
> >>> +            msix->last_table_page = 0;
> >>
> >> To set a pointer please use NULL.
> >
> > Ok.
> >
> >> Overall it looks like you're dealing with the issue for HVM only.
> >> You will want to express this in the title, perhaps by using x86/hvm:
> >> as the prefix. But then the question of course is whether this couldn't
> >> be dealt with in/from mmio_ro_emulated_write(), which handles both HVM
> >> and PV.
> >
> > The issue is correlating the BAR mapping location with the guest's view.
> > Writable BAR areas are mapped (by qemu) via xc_domain_memory_mapping(),
> > but that fails for read-only pages (and indeed, qemu doesn't attempt to
> > do that for the pages with the MSI-X table). Lacking that, I need to use
> > msixtbl_entry->gtable, which is an HVM-only thing.
> >
> > In fact there is another corner case I don't handle here: the guest
> > accessing those registers while MSI-X is disabled. In that case there is
> > no related msixtbl_entry, so I can't correlate the access, but the
> > page(s) are still read-only, so a direct mapping would fail. In practice,
> > such an access will trap into qemu, which will complain "Should not
> > read/write BAR through QEMU". I saw this happen several times while
> > developing the series (due to bugs in my patches), but I haven't found
> > any case where it would happen with the final patch version.
> > In fact, I considered handling this whole thing in qemu (as it knows
> > better where BARs live from the guest's PoV), but stubdomains still
> > don't have write access to those pages, so the accesses would need to be
> > trapped (for the second time) by Xen anyway.
> >
> > For the PV case, I think this extra translation wouldn't be necessary,
> > as BARs are mapped at their actual location, right?
>
> I think so, yes.
>
> > But then, it becomes a rather different implementation (a separate
> > feature), rather than a common one for PV and HVM.
>
> It would be different, yes, and if - as you explain above - there are
> technical reasons why it cannot be shared, then so be it. Mentioning
> this in the description may be worthwhile, or else the same question
> may be asked again (even by me, in case I forget part of the discussion
> by the time I look at a particular future version).
Ok, I'll extend the commit message.
> >> Which in turn raises the question: Do you need to handle reads
> >> in the new code in the first place?
> >
> > The page not being mapped is also the reason why I do need to handle
> > reads too.
>
> Just for my own clarity: You mean "not mapped to qemu" here?
No, to the HVM domain (in the p2m). Xen (outside of the HVM-specific MSI-X
code) doesn't know where those reads should be served from.
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab