
Re: [PATCH 1/3] xen/vioapic: add support for the extended destination ID field



On Wed, 2022-01-26 at 13:47 +0100, Jan Beulich wrote:
> On 25.01.2022 16:13, Roger Pau Monné wrote:
> > On Mon, Jan 24, 2022 at 02:20:47PM +0100, Jan Beulich wrote:
> > > On 20.01.2022 16:23, Roger Pau Monne wrote:
> > > > Such field uses bits 55:48, but for the purposes the register will be
> > > > used use bits 55:49 instead. Bit 48 is used to signal an RTE entry is
> > > > in remappable format which is not supported by the vIO-APIC.
> > > 
> > > Neither here nor in the cover letter you point at a formal specification
> > > of this mode of operation.
> > 
> > I'm not aware of any formal specification of this mode, apart from the
> > work done to introduce support in Linux and QEMU:
> > 
> > https://lore.kernel.org/all/20201009104616.1314746-1-dwmw2@xxxxxxxxxxxxx/
> > 
> > https://git.qemu.org/?p=qemu.git;a=commitdiff;h=c1bb5418e
> > 
> > 
> > Adding David in case there's some kind of specification somewhere I'm
> > not aware of.
> > 
> > > What I'm aware of are vague indications of
> > > this mode's existence in some of Intel's chipset data sheets. Yet that
> > > leaves open, for example, whether indeed bit 48 cannot be used here.
> > 
> > Bit 48 cannot be used because it's already used to signal an RTE is in
> > remappable format. We still want to differentiate an RTE entry in
> > remappable format, as it should be possible to expose both the
> > extended ID support and an emulated IOMMU.
> 
> I think I did say so on irc already: There's not really a problem like
> this. For one I wouldn't expect an OS to use this extended ID at the
> same time as having an IOMMU to deal with the width restriction. And
> then, even if they wanted to use both at the same time, they'd simply
> need to care about the specific meaning of this bit themselves: When
> the bit is set, it would be unavoidable to have it (perhaps identity-)
> remapped by the IOMMU.

As you later said, it's too late for bikeshedding that decision. But I
stand by it regardless of the time.

Even by the time *I* made that choice, it was long since established by
Intel. You could make the same argument about their original hardware
design, that the format bit is pointless and that if an OS enables
interrupt remapping, it knows full well when it's going to use it. It
can even be configured in the IOMMU per PCI function.

There is benefit to having a very clear and unambiguous difference
between the MSI formats that isn't entirely dependent on the IOMMU
being configured correctly. And in my case there is *definitely*
benefit to following the precedent already set by Intel in the real
hardware. For me, those outweighed the marginal additional benefit of
going from 15 to 16 bits of APIC ID in the MSI.
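
To make the layout concrete, here is a minimal sketch of how the fields
discussed in this thread sit in a 64-bit RTE: destination ID in bits 63:56,
extended destination ID in bits 55:49, and the format bit in bit 48. The macro
and function names are hypothetical, chosen for illustration; this is neither
the Xen nor the Linux code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions within a 64-bit RTE, as described in this thread.
 * All names here are hypothetical, for illustration only. */
#define RTE_DEST_ID_SHIFT      56  /* bits 63:56: destination ID [7:0]  */
#define RTE_EXT_DEST_ID_SHIFT  49  /* bits 55:49: destination ID [14:8] */
#define RTE_REMAP_FORMAT_BIT   48  /* set: entry is in Remappable Format */

/* True if the entry claims Remappable Format (bit 48 set). */
static inline bool rte_is_remappable(uint64_t rte)
{
    return (rte >> RTE_REMAP_FORMAT_BIT) & 1;
}

/* 15-bit destination for a Compatibility Format entry: the legacy 8-bit
 * ID in the low byte, the 7-bit extended ID above it. */
static inline uint16_t rte_dest(uint64_t rte)
{
    uint16_t lo = (rte >> RTE_DEST_ID_SHIFT) & 0xff;
    uint16_t hi = (rte >> RTE_EXT_DEST_ID_SHIFT) & 0x7f;

    return lo | (hi << 8);
}
```

Keeping bit 48 out of the destination is what limits this to 15 bits, but it
means the format of an entry can be read from the entry alone.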

> > > > --- a/xen/arch/x86/hvm/vioapic.c
> > > > +++ b/xen/arch/x86/hvm/vioapic.c
> > > > @@ -412,7 +412,8 @@ static void ioapic_inj_irq(
> > > >  
> > > >  static void vioapic_deliver(struct hvm_vioapic *vioapic, unsigned int pin)
> > > >  {
> > > > -    uint16_t dest = vioapic->redirtbl[pin].fields.dest_id;
> > > > +    uint16_t dest = vioapic->redirtbl[pin].fields.dest_id |
> > > > +                    (vioapic->redirtbl[pin].fields.ext_dest_id << 8);
> > > 
> > > What if an existing guest has been writing non-zero in these bits? Can
> > > you really use them here without any further indication by the guest?
> > 
> > Those bits were reserved previously, so no OS should have used them.
> > There are hypervisors already in the field (QEMU/KVM and HyperV) using
> > this mode.
> > 
> > We could add a per-domain option to disable extended ID mode if we are
> > really worried about OSes having used those bits for some reason.
> 
> Generally I think previously ignored bits need to be handled with care.
> If there was a specification, what is being said there might serve as
> a guideline for us. Even if there was just a proper description of the
> EDID field found in recent Intel chipset spec, this might already help
> determining whether we want/need an enable (or disable). But there's
> not even a bit announcing the functionality in, say, the version
> register.

It's not very verbose, but the Extended Destination ID in the I/OAPIC
is at least mentioned in the RTE documentation in the 82806AA datasheet
https://datasheet.octopart.com/FW82806AA-SL3VZ-Intel-datasheet-13695406.pdf 

See page 47, §2.4.10 "Redirection Table High DWord".

The rest you have to kind of piece together from the later
documentation once they actually started *using* it for IRQ remapping.
I think it may also have been used on IA64?

The realisation that we didn't need to have special different code to
compose RTE entries for Compatibility Format vs. Remappable Format, and
that we could just allow the 'upstream' APIC code to compose the MSI
message and then swizzle the bits into the RTE... was rather slow to
come.
https://lore.kernel.org/all/20201024213535.443185-22-dwmw2@xxxxxxxxxxxxx/
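
As a rough sketch of that swizzle, assuming the usual x86 MSI layout (address
bits 19:12 carry destination ID [7:0], bits 11:5 the extended destination ID,
bit 4 the format bit, bit 2 the destination mode; data bits 7:0 the vector and
bits 10:8 the delivery mode): address bits 19:4 then map straight onto RTE
bits 63:48. The function name is hypothetical, and trigger mode, polarity and
mask are left out because they don't come from the MSI message.

```c
#include <stdint.h>

/* Hypothetical helper: build the MSI-derived part of an RTE from the low
 * dword of the MSI address and the MSI data. Trigger mode, polarity and
 * mask bits are deliberately not handled here. */
static inline uint64_t rte_from_msi(uint32_t addr_lo, uint32_t data)
{
    uint64_t rte = 0;

    rte |= data & 0xff;                                /* vector: bits 7:0 */
    rte |= data & (0x7u << 8);                         /* delivery mode: bits 10:8 */
    rte |= (uint64_t)((addr_lo >> 2) & 1) << 11;       /* destination mode: bit 11 */
    rte |= (uint64_t)((addr_lo >> 4) & 0xffff) << 48;  /* addr 19:4 -> RTE 63:48 */

    return rte;
}
```

The same code path covers both formats, since the format bit and whatever
sits in the destination bits simply travel along with the rest of the address.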
