Re: SR-IOV: do we need to virtualize in Xen or rely on Dom0?
On Thu, Jun 10, 2021 at 10:01:16AM +0000, Oleksandr Andrushchenko wrote:
> Hello, Roger!
>
> On 10.06.21 10:54, Roger Pau Monné wrote:
> > On Fri, Jun 04, 2021 at 06:37:27AM +0000, Oleksandr Andrushchenko wrote:
> >> Hi, all!
> >>
> >> While working on PCI SR-IOV support for ARM I started porting [1] on
> >> top of the current PCI on ARM support [2]. The question I have for this
> >> series is whether we really need to emulate SR-IOV code in Xen.
> >>
> >> I have implemented a PoC for SR-IOV on ARM [3] (please see the top 2
> >> patches) and it "works for me": MSI support is still WIP, but I was
> >> able to see that VFs are properly seen in the guest and BARs are
> >> properly programmed in p2m.
> >>
> >> What I can't fully understand is whether we can live with this approach
> >> or there are use-cases I can't see.
> >>
> >> Previously I've been told that this approach might not work on FreeBSD
> >> running as Domain-0, but it seems that "PCI Passthrough is not
> >> supported (Xen/FreeBSD)" anyway [4].
> > PCI passthrough is not supported on FreeBSD dom0 because PCI
> > passthrough is not supported by Xen itself when using a PVH dom0, and
> > that's the only mode FreeBSD dom0 can use.
> So, it is still not clear to me whether and how PCI passthrough is
> supported on FreeBSD, and what the scenarios and requirements for that
> are.
>
> > PHYSDEVOP_pci_device_add can be added to FreeBSD, so it could be made
> > to work. I however think this is not the proper way to implement
> > SR-IOV support.
> I was not able to find any support for PHYSDEVOP_XXX in the FreeBSD
> code, could you please point me to where these are used?

Those are not used on FreeBSD, because x86 PVHv2 dom0 doesn't implement
them anymore. They are implemented on Linux for x86 PV dom0; AFAIK Arm
doesn't use them either.

> If they are not used, how does Xen under FreeBSD know about PCI devices?

Xen scans the PCI bus itself, see scan_pci_devices.

> I am trying to extrapolate my knowledge of how Linux does that
> (during PCI enumeration in Domain-0 we use hypercalls)
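For reference, the Linux x86 PV dom0 path boils down to one such call per
discovered device. A minimal sketch against Xen's public physdev
interface (the report_device_to_xen() helper is illustrative, and error
handling is elided):

#include <linux/types.h>
#include <xen/interface/physdev.h>   /* struct physdev_pci_device_add */
#include <asm/xen/hypercall.h>       /* HYPERVISOR_physdev_op() */

/*
 * Illustrative helper: report a device found during dom0 enumeration to
 * Xen. For a VF, flags carries XEN_PCI_DEV_VIRTFN and physfn identifies
 * the parent PF, which is how Xen ties the VF back to its origin.
 */
static int report_device_to_xen(uint16_t seg, uint8_t bus, uint8_t devfn,
                                bool is_vf, uint8_t pf_bus,
                                uint8_t pf_devfn)
{
    struct physdev_pci_device_add add = {
        .seg = seg,
        .bus = bus,
        .devfn = devfn,
    };

    if (is_vf) {
        add.flags = XEN_PCI_DEV_VIRTFN;
        add.physfn.bus = pf_bus;
        add.physfn.devfn = pf_devfn;
    }

    return HYPERVISOR_physdev_op(PHYSDEVOP_pci_device_add, &add);
}

Note that for a VF this can only happen after the PF driver in Domain-0
has enabled SR-IOV, which is exactly the dependency on Domain-0 discussed
above.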
> >> I also see that the ACRN hypervisor [5] implements SR-IOV inside it,
> >> which makes me think I am missing some important use-case on x86
> >> though.
> >>
> >> I would like to ask for any advice with respect to SR-IOV in the
> >> hypervisor, any pointers to documentation or any other source which
> >> might be handy in deciding whether we need SR-IOV complexity in Xen.
> >>
> >> And it does bring complexity, if you compare [1] and [3]...
> >>
> >> A bit of technical detail on the approach implemented in [3]:
> >> 1. We rely on PHYSDEVOP_pci_device_add
> >> 2. We rely on Domain-0 SR-IOV drivers to instantiate VFs
> >> 3. BARs are programmed in p2m, implementing the guest's view of them
> >> (we have extended the vPCI code for that, and this path is used for
> >> both "normal" devices and VFs the same way)
> >> 4. No need to trap PCI_SRIOV_CTRL
> >> 5. No need to wait 100ms in Xen before attempting to access VF
> >> registers when enabling virtual functions on the PF - this is handled
> >> by Domain-0 itself.
> > I think the SR-IOV capability should be handled like any other PCI
> > capability, ie: like we currently handle MSI or MSI-X in vPCI.
> >
> > It's likely that using some kind of hypercall in order to deal with
> > SR-IOV could make this easier to implement in Xen, but that just adds
> > more code to all OSes that want to run as the hardware domain.
> I didn't introduce anything new; PHYSDEVOP_pci_device_add was enough.

Well, that would be 'new' on x86 PVH or Arm, as they don't implement any
PHYSDEVOP at the moment. Long term we might need a hypercall to report
dynamic MCFG regions, but I haven't got around to it yet (and haven't
found any system that reports extra MCFG regions from ACPI AML).

> The rest I did in Xen itself wrt SR-IOV.
>
> > OTOH if we properly trap accesses to the SR-IOV capability (like it
> > was proposed in [1] from your references) we won't have to modify OSes
> > that want to run as hardware domains in order to handle SR-IOV devices.
> Out of curiosity, could you please name a few? I do understand that we
> want to support unmodified OSes and this is indeed important.
> But still, what are the other OSes which do support Xen + PCI
> passthrough?

NetBSD PV dom0 does support PCI passthrough, but I'm not sure that's
relevant. We shouldn't focus on current users to come up with an
interface, but rather think about how we want that interface to look. As
I said in the previous email, my opinion is that unless it's technically
impossible we should just trap accesses to the SR-IOV capability, like we
do for MSI(-X), and handle it transparently from a guest PoV.

> > IMO going for the hypercall option seems easier now, but adds a burden
> > to all OSes that want to manage SR-IOV devices that will hurt us long
> > term.
> Again, I was able to make it somewhat work with PHYSDEVOP_pci_device_add
> only.

Sure, that's how it works for an x86 PV hardware domain, so it's
certainly possible. My comments against that route are not because it's
not technically feasible, but because I don't like the approach. So far
we have avoided PVH having to implement any PHYSDEVOP hypercall, and
that's a design decision, not a coincidence. I'm in favor of using the
existing hardware interfaces for guests instead of introducing custom Xen
ones when technically feasible.
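To make the trap-based option concrete, here is a rough sketch in vPCI
terms, modelled on the existing MSI-X handling. vpci_add_register(),
vpci_hw_read16() and REGISTER_VPCI_INIT() are the existing vPCI hooks,
while sriov_ctrl_write(), init_sriov() and wait_for_vf_settle() are
hypothetical:

#include <xen/pci.h>
#include <xen/pci_regs.h>
#include <xen/vpci.h>

/* Hypothetical handler: trap writes to the SR-IOV control register. */
static void sriov_ctrl_write(const struct pci_dev *pdev, unsigned int reg,
                             uint32_t val, void *data)
{
    pci_conf_write16(pdev->sbdf, reg, val);

    if ( val & PCI_SRIOV_CTRL_VFE )
    {
        /*
         * Per the SR-IOV spec, VF config space may take up to 100ms to
         * become valid after VF Enable is set, so with this model the
         * wait lives in Xen rather than in dom0.
         */
        wait_for_vf_settle();   /* hypothetical */
        /*
         * Enumerate the new VFs, size their BARs from the PF's SR-IOV
         * capability, and set up the guest's view of them in the p2m.
         */
    }
}

static int init_sriov(struct pci_dev *pdev)
{
    unsigned int pos = pci_find_ext_capability(pdev->seg, pdev->bus,
                                               pdev->devfn,
                                               PCI_EXT_CAP_ID_SRIOV);

    if ( !pos )
        return 0;

    return vpci_add_register(pdev->vpci, vpci_hw_read16, sriov_ctrl_write,
                             pos + PCI_SRIOV_CTRL, 2, NULL);
}
REGISTER_VPCI_INIT(init_sriov, VPCI_PRIORITY_LOW);

The comments show where the extra complexity lands: the 100ms wait and
the VF enumeration move into Xen, which is what [1] carries and what [3]
avoids by leaning on Domain-0.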
Thanks, Roger.