
Re: [RFC PATCH V1 01/12] hvm/ioreq: Make x86's IOREQ feature common



On Tue, 11 Aug 2020, Julien Grall wrote:
> On 11/08/2020 00:34, Stefano Stabellini wrote:
> > On Mon, 10 Aug 2020, Julien Grall wrote:
> > > On 07/08/2020 00:48, Stefano Stabellini wrote:
> > > > On Thu, 6 Aug 2020, Julien Grall wrote:
> > > > > On 06/08/2020 01:37, Stefano Stabellini wrote:
> > > > > > On Wed, 5 Aug 2020, Julien Grall wrote:
> > > > > > > On 04/08/2020 20:11, Stefano Stabellini wrote:
> > > > > > > > On Tue, 4 Aug 2020, Julien Grall wrote:
> > > > > > > > > On 04/08/2020 12:10, Oleksandr wrote:
> > > > > > > > > > On 04.08.20 10:45, Paul Durrant wrote:
> > > > > > > > > > > > +static inline bool hvm_ioreq_needs_completion(const ioreq_t *ioreq)
> > > > > > > > > > > > +{
> > > > > > > > > > > > +    return ioreq->state == STATE_IOREQ_READY &&
> > > > > > > > > > > > +           !ioreq->data_is_ptr &&
> > > > > > > > > > > > +           (ioreq->type != IOREQ_TYPE_PIO || ioreq->dir != IOREQ_WRITE);
> > > > > > > > > > > > +}
> > > > > > > > > > > I don't think having this in common code is correct. The
> > > > > > > > > > > short-cut of not completing PIO reads seems somewhat x86
> > > > > > > > > > > specific.
> > > > > > > > > 
> > > > > > > > > Hmmm, looking at the code, I think it doesn't wait for PIO
> > > > > > > > > writes to complete (not reads). Did I miss anything?
> > > > > > > > > 
> > > > > > > > > > > Does ARM even have the concept of PIO?
> > > > > > > > > >
> > > > > > > > > > I am not 100% sure here, but it seems that it doesn't.
> > > > > > > > > 
> > > > > > > > > Technically, PIOs exist on Arm; however, they are accessed
> > > > > > > > > the same way as MMIO and will have a dedicated area defined
> > > > > > > > > by the HW.
> > > > > > > > >
> > > > > > > > > AFAICT, on Arm64, they are only used for PCI I/O BARs.
> > > > > > > > > 
> > > > > > > > > Now the question is whether we want to expose them to the
> > > > > > > > > Device Emulator as PIO or MMIO access. From a generic PoV, a
> > > > > > > > > DM shouldn't have to care about the architecture used. It
> > > > > > > > > should just be able to request a given IOport region.
> > > > > > > > >
> > > > > > > > > So it may make sense to differentiate them in the common
> > > > > > > > > ioreq code as well.
> > > > > > > > > 
> > > > > > > > > I had a quick look at QEMU and wasn't able to tell whether
> > > > > > > > > the PIO and MMIO address spaces are different on Arm as well.
> > > > > > > > > Paul, Stefano, do you know what they are doing?
> > > > > > > > 
> > > > > > > > On the QEMU side, it looks like PIO (address_space_io) is used
> > > > > > > > in connection with the emulation of the "in" or "out"
> > > > > > > > instructions, see ioport.c:cpu_inb for instance. Some parts of
> > > > > > > > PCI on QEMU emulate PIO space regardless of the architecture,
> > > > > > > > such as hw/pci/pci_bridge.c:pci_bridge_initfn.
> > > > > > > >
> > > > > > > > However, because there is no "in" and "out" on ARM, I don't
> > > > > > > > think address_space_io can be accessed. Specifically, there is
> > > > > > > > no equivalent for target/i386/misc_helper.c:helper_inb on ARM.
> > > > > > > 
> > > > > > > So how are PCI I/O BARs accessed? Surely, they could be used on
> > > > > > > Arm, right?
> > > > > > 
> > > > > > PIO is also memory mapped on ARM and it seems to have its own MMIO
> > > > > > address window.
> > > > > This part is already well understood :). However, this only tells
> > > > > us how an OS is accessing a PIO.
> > > > > 
> > > > > What I am trying to figure out is how the hardware (or QEMU) is
> > > > > meant to work.
> > > > > 
> > > > > From my understanding, the MMIO access will be received by the
> > > > > hostbridge and then forwarded to the appropriate PCI device. The two
> > > > > questions I am trying to answer are: how are the I/O BARs
> > > > > configured? Will they contain an MMIO address or an offset?
> > > > >
> > > > > If the answer is the latter, then we will need PIO because a DM will
> > > > > never see the MMIO address (the hostbridge will be emulated in Xen).
> > > > 
> > > > Now I understand the question :-)
> > > > 
> > > > This is the way I understand it works. Let's say that the PIO aperture
> > > > is 0x1000-0x2000 which is aliased to 0x3eff0000-0x3eff1000.
> > > > 0x1000-0x2000 are addresses that cannot be accessed directly.
> > > > 0x3eff0000-0x3eff1000 is the range that works.
> > > > 
> > > > A PCI device PIO BAR will have an address in the 0x1000-0x2000 range,
> > > > for instance 0x1100.
> 
> Are you sure about this?

I am pretty sure, but only from reading the code. It would be great if
somebody ran QEMU and actually tested it. This is important because it
could make the whole discussion moot :-)
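
FWIW, the translation itself is trivial; here is a minimal sketch, using
the hypothetical addresses from the example above (whoever ends up doing
it, Xen or the emulator; all names below are made up for illustration):

    #include <stdbool.h>
    #include <stdint.h>

    #define PIO_BASE        0x1000      /* PIO aperture, as seen in the BARs */
    #define PIO_SIZE        0x1000
    #define PIO_ALIAS_BASE  0x3eff0000  /* MMIO alias the OS actually uses */

    /* Recover the address in PIO space from a trapped access in the
     * MMIO alias window, before forwarding it to the device emulator. */
    static bool mmio_to_pio(uint64_t mmio_addr, uint64_t *pio_addr)
    {
        if ( mmio_addr < PIO_ALIAS_BASE ||
             mmio_addr >= PIO_ALIAS_BASE + PIO_SIZE )
            return false;  /* not an access to the PIO alias window */

        *pio_addr = PIO_BASE + (mmio_addr - PIO_ALIAS_BASE);
        return true;       /* e.g. 0x3eff0100 -> 0x1100 */
    }

So a read of 0x3eff0100 would correspond to an access to 0x1100, i.e.
the address programmed into the device's I/O BAR.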


> > > > However, when the operating system accesses 0x1100, it will issue a
> > > > read to 0x3eff0100.
> > > >
> > > > Xen will trap the read to 0x3eff0100 and send it to QEMU.
> > > >
> > > > QEMU has to know that 0x3eff0000-0x3eff1000 is the alias to the PIO
> > > > aperture and that 0x3eff0100 corresponds to PCI device foobar.
> > > > Similarly, QEMU also has to know the address range of the MMIO
> > > > aperture and its remappings, if any (it is possible to have address
> > > > remapping for MMIO addresses too).
> > > > 
> > > > I think today this information is built into QEMU, not configurable.
> > > > It works fine because *I think* the PCI aperture is pretty much the
> > > > same on x86 boards, at least the ones supported by QEMU for Xen.
> > > 
> > > Well, on x86 the OS will access PIO using inb/outb. So the address
> > > received by Xen is in 0x1000-0x2000 and is then forwarded to the DM
> > > using the PIO type.
> > > 
> > > > On ARM, I think we should explicitly declare the PCI MMIO aperture and
> > > > its alias/address-remapping. When we do that, we can also declare the
> > > > PIO aperture and its alias/address-remapping.
> > > 
> > > Well yes, we need to define the PCI MMIO and PCI I/O regions because
> > > the guest OS needs to know them.
> > 
> > [1]
> > (see below)
> > 
> > 
> > > However, I am unsure how this would help us to solve the question of
> > > whether access to the PCI I/O aperture should be sent as a PIO or an
> > > MMIO.
> > >
> > > Per what you wrote, the PCI I/O BAR would be configured with the range
> > > 0x1000-0x2000. So a device emulator (this may not be QEMU and may only
> > > emulate one PCI device!!) will only see that range.
> > >
> > > How does the device emulator then know that it needs to watch the
> > > region 0x3eff0000-0x3eff1000?
> > 
> > It would know because the PCI PIO aperture, together with its alias,
> > would be specified [1].
> 
> Are you suggesting fixing it in the ABI or passing it as runtime
> information to the Device Emulator?

I am suggesting "fixing" it in the ABI. Whether we also pass it at
runtime is less important, I think.
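
For instance (purely hypothetical, only to illustrate what "fixing it
in the ABI" could look like), the aperture and its alias could be
described by something like:

    #include <stdint.h>

    /* Hypothetical ABI structure describing the PCI PIO aperture and
     * its MMIO alias; nothing like this exists today. */
    struct xen_pci_pio_window {
        uint64_t pio_base;    /* base of the PIO aperture, e.g. 0x1000 */
        uint64_t pio_size;    /* size of the aperture, e.g. 0x1000 */
        uint64_t mmio_alias;  /* MMIO alias used by the OS, e.g. 0x3eff0000 */
    };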
 
 
> > > It feels to me that it would be easier/make more sense if the DM only
> > > said "I want to watch the PIO range 0x1000-0x2000". Xen would then be
> > > in charge of doing the translation between the OS view and the DM view.
> > >
> > > This also means a DM would be completely arch-agnostic. This would
> > > follow the HW, where you can plug your PCI card into any HW.
> > 
> > As you know, PIO access is actually not modelled by QEMU for ARM
> > targets. I worry about its long-term stability, given that it is
> > untested: qemu-system-aarch64 could have broken PIO emulation and nobody
> > would find out except us, when we send ioreqs to it.
> 
> There are multiple references to PIO in QEMU for Arm (see hw/arm/virt.c).
> So what do you mean by "not modelled"?

I mean that PIO is only emulated as an MMIO region, not as port-mapped
I/O.
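
I.e. on the virt board the guest's accesses land in an MMIO window
whose handler simply turns the offset into a port number. Roughly (my
own sketch, not QEMU's actual code; pio_read is a made-up helper
standing in for the device's port handlers):

    #include <stdint.h>

    #define PIO_WINDOW_BASE 0x3eff0000ULL  /* hypothetical MMIO window base */

    /* Made-up helper dispatching to the emulated device's port handlers. */
    uint32_t pio_read(uint16_t port, unsigned int size);

    /* MMIO read handler for the PIO window: the "port" is just the
     * offset into the window; no in/out instruction is ever involved. */
    static uint64_t pio_window_read(uint64_t addr, unsigned int size)
    {
        return pio_read((uint16_t)(addr - PIO_WINDOW_BASE), size);
    }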


> > Thinking of the Xen/emulator interface on ARM, is it wise to rely on an
> > access type that doesn't exist on the architecture?
> 
> The architecture doesn't define an instruction to access PIO; however,
> this doesn't mean such accesses don't exist on the platform.
>
> For instance, a PCI device may have an I/O BAR. On Arm64, the hostbridge
> will be responsible for translating the MMIO access into a PIO access
> for the PCI device.

As far as I understand, the hostbridge is responsible for any
translations between the host address space and the address space of the
devices. Even for MMIO addresses there are translations. The hostbridge
doesn't do any port-mapped I/O to communicate with the device; there is
a different protocol for those communications. Port-mapped I/O is just
how the hostbridge exposes the device to the host.

If we wanted to emulate the hostbridge/device interface properly, we
would end up with something pretty different.


> I have the impression that we disagree on what the Device Emulator is
> meant to do. IMHO, the goal of the device emulator is to emulate a
> device in an arch-agnostic way.

That would be great in theory, but I am not sure it is achievable: if we
use an existing emulator like QEMU, even a single device has to fit into
QEMU's view of the world, which makes assumptions about host bridges and
apertures. It is impossible today to build QEMU in an arch-agnostic way;
it has to be tied to an architecture.

I realize we are not building this interface for QEMU specifically, but
even if we try to make the interface arch-agnostic, in reality the
emulators won't be arch-agnostic. If we send a port-mapped I/O request
to qemu-system-aarch64, who knows what is going to happen: it is a code
path that is not explicitly tested.

 

