[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] xen-pciback: allow compiling on other archs than x86



On Tue, 21 Sep 2021, Oleksandr Andrushchenko wrote:
> On 21.09.21 10:09, Juergen Gross wrote:
> > On 21.09.21 09:00, Oleksandr Andrushchenko wrote:
> >>
> >> On 21.09.21 09:49, Juergen Gross wrote:
> >>> On 21.09.21 08:38, Oleksandr Andrushchenko wrote:
> >>>>
> >>>> On 21.09.21 09:07, Juergen Gross wrote:
> >>>>> On 21.09.21 07:51, Oleksandr Andrushchenko wrote:
> >>>>>>
> >>>>>> On 21.09.21 08:20, Juergen Gross wrote:
> >>>>>>> On 21.09.21 01:16, Stefano Stabellini wrote:
> >>>>>>>> On Mon, 20 Sep 2021, Oleksandr Andrushchenko wrote:
> >>>>>>>>> On 20.09.21 14:30, Juergen Gross wrote:
> >>>>>>>>>> On 20.09.21 07:23, Oleksandr Andrushchenko wrote:
> >>>>>>>>>>> Hello, Stefano!
> >>>>>>>>>>>
> >>>>>>>>>>> On 18.09.21 00:45, Stefano Stabellini wrote:
> >>>>>>>>>>>> Hi Oleksandr,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Why do you want to enable pciback on ARM? Is it only to 
> >>>>>>>>>>>> "disable" a PCI
> >>>>>>>>>>>> device in Dom0 so that it can be safely assigned to a DomU?
> >>>>>>>>>>> Not only that
> >>>>>>>>>>>>
> >>>>>>>>>>>> I am asking because actually I don't think we want to enable the 
> >>>>>>>>>>>> PV PCI
> >>>>>>>>>>>> backend feature of pciback on ARM, right? That would clash with 
> >>>>>>>>>>>> the PCI
> >>>>>>>>>>>> assignment work you have been doing in Xen. They couldn't both 
> >>>>>>>>>>>> work at
> >>>>>>>>>>>> the same time.
> >>>>>>>>>>> Correct, it is not used
> >>>>>>>>>>>>
> >>>>>>>>>>>> If we only need pciback to "park" a device in Dom0, wouldn't it 
> >>>>>>>>>>>> be
> >>>>>>>>>>>> possible and better to use pci-stub instead?
> >>>>>>>>>>>
> >>>>>>>>>>> Not only that, so pci-stub is not enough
> >>>>>>>>>>>
> >>>>>>>>>>> The functionality which is implemented by the pciback and the 
> >>>>>>>>>>> toolstack
> >>>>>>>>>>> and which is relevant/missing/needed for ARM:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. pciback is used as a database for assignable PCI devices, e.g. 
> >>>>>>>>>>> xl
> >>>>>>>>>>>          pci-assignable-{add|remove|list} manipulates that list. 
> >>>>>>>>>>> So, whenever the
> >>>>>>>>>>>          toolstack needs to know which PCI devices can be passed 
> >>>>>>>>>>> through it reads
> >>>>>>>>>>>          that from the relevant sysfs entries of the pciback.
> >>>>>>>>>>>
> >>>>>>>>>>> 2. pciback is used to hold the unbound PCI devices, e.g. when 
> >>>>>>>>>>> passing through
> >>>>>>>>>>>          a PCI device it needs to be unbound from the relevant 
> >>>>>>>>>>> device driver and bound
> >>>>>>>>>>>          to pciback (strictly speaking it is not required that 
> >>>>>>>>>>> the device is bound to
> >>>>>>>>>>>          pciback, but pciback is again used as a database of the 
> >>>>>>>>>>> passed through PCI
> >>>>>>>>>>>          devices, so we can re-bind the devices back to their 
> >>>>>>>>>>> original drivers when
> >>>>>>>>>>>          guest domain shuts down)
> >>>>>>>>>>>
> >>>>>>>>>>> 3. Device reset
> >>>>>>>>>>>
> >>>>>>>>>>> We have previously discussed on xen-devel ML possible solutions 
> >>>>>>>>>>> to that as from the
> >>>>>>>>>>> above we see that pciback functionality is going to be only 
> >>>>>>>>>>> partially used on Arm.
> >>>>>>>>>>>
> >>>>>>>>>>> Please see [1] and [2]:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. It is not acceptable to manage the assignable list in Xen 
> >>>>>>>>>>> itself
> >>>>>>>>>>>
> >>>>>>>>>>> 2. pciback can be split into two parts: PCI assignable/bind/reset 
> >>>>>>>>>>> handling and
> >>>>>>>>>>> the rest like vPCI etc.
> >>>>>>>>>>>
> >>>>>>>>>>> 3. pcifront is not used on Arm
> >>>>>>>>>>
> >>>>>>>>>> It is neither in x86 PVH/HVM guests.
> >>>>>>>>> Didn't know that, thank you for pointing
> >>>>>>>>>>
> >>>>>>>>>>> So, limited use of the pciback is one of the bricks used to 
> >>>>>>>>>>> enable PCI passthrough
> >>>>>>>>>>> on Arm. It was enough to just re-structure the driver and have it 
> >>>>>>>>>>> run on Arm to achieve
> >>>>>>>>>>> all the goals above.
> >>>>>>>>>>>
> >>>>>>>>>>> If we still think it is desirable to break the pciback driver 
> >>>>>>>>>>> into "common" and "pcifront specific"
> >>>>>>>>>>> parts then it can be done, yet the patch is going to be the very 
> >>>>>>>>>>> first brick in that building.
> >>>>>>>>>>
> >>>>>>>>>> Doing this split should be done, as the pcifront specific part 
> >>>>>>>>>> could be
> >>>>>>>>>> omitted on x86, too, in case no PV guests using PCI passthrough 
> >>>>>>>>>> have to
> >>>>>>>>>> be supported.
> >>>>>>>>> Agree, that the final solution should have the driver split
> >>>>>>>>>>
> >>>>>>>>>>> So, I think this patch is still going to be needed besides which 
> >>>>>>>>>>> direction we take.
> >>>>>>>>>>
> >>>>>>>>>> Some kind of this patch, yes. It might look different in case the 
> >>>>>>>>>> split
> >>>>>>>>>> is done first.
> >>>>>>>>>>
> >>>>>>>>>> I don't mind doing it in either sequence.
> >>>>>>>>>>
> >>>>>>>>> With this patch we have Arm on the same page as the above mentioned 
> >>>>>>>>> x86 guests,
> >>>>>>>>>
> >>>>>>>>> e.g. the driver has unused code, but yet allows Arm to function now.
> >>>>>>>>>
> >>>>>>>>> At this stage of PCI passthrough on Arm it is yet enough. Long 
> >>>>>>>>> term, when
> >>>>>>>>>
> >>>>>>>>> the driver gets split, Arm will benefit from that split too, but 
> >>>>>>>>> unfortunately I do not
> >>>>>>>>>
> >>>>>>>>> have enough bandwidth for that piece of work at the moment.
> >>>>>>>>
> >>>>>>>> That's fair and I don't want to scope-creep this simple patch asking 
> >>>>>>>> for
> >>>>>>>> an enormous rework. At the same time I don't think we should enable 
> >>>>>>>> the
> >>>>>>>> whole of pciback on ARM because it would be erroneous and confusing.
> >>>>>>
> >>>>>> As the first stage before the driver is split or ifdef's used - can we 
> >>>>>> take the patch
> >>>>>> as is now? In either way we chose this needs to be done, e.g. enable 
> >>>>>> compiling
> >>>>>> for other architectures and common code move.
> >>>>>
> >>>>> Fine with me in principle. I need to take a more thorough look
> >>>>> at the patch, though.
> >>>> Of course
> >>>>>
> >>>>>>
> >>>>>>>>
> >>>>>>>> I am wonder if there is a simple:
> >>>>>>>>
> >>>>>>>> if (!xen_pv_domain())
> >>>>>>>>         return;
> >>>>>>>>
> >>>>>>>> That we could add in a couple of places in pciback to stop it from
> >>>>>>>> initializing the parts we don't care about. Something along these 
> >>>>>>>> lines
> >>>>>>>> (untested and probably incomplete).
> >>>>>>>>
> >>>>>>>> What do you guys think?
> >>>>>>>
> >>>>>>> Uh no, not in this way, please. This will kill pci passthrough on x86
> >>>>>>> with dom0 running as PVH. I don't think this is working right now, but
> >>>>>>> adding more code making it even harder to work should be avoided.
> >>>>>>>
> >>>>>>>> diff --git a/drivers/xen/xen-pciback/xenbus.c 
> >>>>>>>> b/drivers/xen/xen-pciback/xenbus.c
> >>>>>>>> index da34ce85dc88..991ba0a9b359 100644
> >>>>>>>> --- a/drivers/xen/xen-pciback/xenbus.c
> >>>>>>>> +++ b/drivers/xen/xen-pciback/xenbus.c
> >>>>>>>> @@ -15,6 +15,7 @@
> >>>>>>>>      #include <xen/xenbus.h>
> >>>>>>>>      #include <xen/events.h>
> >>>>>>>>      #include <xen/pci.h>
> >>>>>>>> +#include <xen/xen.h>
> >>>>>>>>      #include "pciback.h"
> >>>>>>>>        #define INVALID_EVTCHN_IRQ  (-1)
> >>>>>>>> @@ -685,8 +686,12 @@ static int xen_pcibk_xenbus_probe(struct 
> >>>>>>>> xenbus_device *dev,
> >>>>>>>>                      const struct xenbus_device_id *id)
> >>>>>>>>      {
> >>>>>>>>          int err = 0;
> >>>>>>>> -    struct xen_pcibk_device *pdev = alloc_pdev(dev);
> >>>>>>>> +    struct xen_pcibk_device *pdev;
> >>>>>>>> +
> >>>>>>>> +    if (!xen_pv_domain())
> >>>>>>>> +        return 0;
> >>>>>>>>      +    pdev = alloc_pdev(dev);
> >>>>>>>
> >>>>>>> This hunk isn't needed, as with bailing out of 
> >>>>>>> xen_pcibk_xenbus_register
> >>>>>>> early will result in xen_pcibk_xenbus_probe never being called.
> >>>>>>>
> >>>>>>>>          if (pdev == NULL) {
> >>>>>>>>              err = -ENOMEM;
> >>>>>>>>              xenbus_dev_fatal(dev, err,
> >>>>>>>> @@ -743,6 +748,9 @@ const struct xen_pcibk_backend *__read_mostly 
> >>>>>>>> xen_pcibk_backend;
> >>>>>>>>        int __init xen_pcibk_xenbus_register(void)
> >>>>>>>>      {
> >>>>>>>> +    if (!xen_pv_domain())
> >>>>>>>> +        return 0;
> >>>>>>>> +
> >>>>>>>
> >>>>>>> Use #ifdef CONFIG_X86 instead.
> >>>>>>
> >>>>>> The title of this patch says that we want to allow this driver for 
> >>>>>> other archs
> >>>>>> and now we want to introduce "#ifdef CONFIG_X86" which doesn't sound
> >>>>>> right with that respect. Instead, we may want having something like a
> >>>>>> dedicated gate for this, e.g. "#ifdef 
> >>>>>> CONFIG_XEN_PCIDEV_BACKEND_SUPP_PV"
> >>>>>> or something which is architecture agnostic.
> >>>>>
> >>>>> Something like that, yes. But I'd rather use CONFIG_XEN_PCIDEV_BACKEND
> >>>>> acting as this gate and introduce CONFIG_XEN_PCI_STUB for the stub
> >>>>> functionality needed on Arm. XEN_PCIDEV_BACKEND would depend on X86 and
> >>>>> select XEN_PCI_STUB, while on Arm XEN_PCI_STUB could be configured if
> >>>>> wanted. The splitting of the driver can still be done later.
> >>>>
> >>>> Hm, pciback is now compiled when CONFIG_XEN_PCIDEV_BACKEND is enabled
> >>>> and we want to skip some parts of its code when CONFIG_XEN_PCI_STUB is 
> >>>> set.
> >>>> So, I imagine that for x86 we just enable CONFIG_XEN_PCIDEV_BACKEND and 
> >>>> the
> >>>> driver compiles in its current state. For Arm we enable both 
> >>>> CONFIG_XEN_PCIDEV_BACKEND
> >>>> and CONFIG_XEN_PCI_STUB, so part of the driver is not compiled.
> >>>
> >>> No, I'd rather switch to compiling xen-pciback when CONFIG_XEN_PCI_STUB
> >>> is set and compile only parts of it when CONFIG_XEN_PCIDEV_BACKEND is
> >>> not set (this will be the case on Arm).
> >>
> >> But this will require that the existing kernel configurations out there 
> >> have to additionally define CONFIG_XEN_PCI_STUB to get what they had 
> >> before with simply enabling CONFIG_XEN_PCIDEV_BACKEND. My point was that 
> >> it is probably desirable not to break
> >> the things while doing the split/re-work.
> >
> > By letting XEN_PCIDEV_BACKEND select XEN_PCI_STUB this won't happen.
> Indeed
> >
> >> I also thought that "compile only parts of it when 
> >> CONFIG_XEN_PCIDEV_BACKEND is not set"
> >> may have more code gated rather than with gating unwanted code with 
> >> CONFIG_XEN_PCI_STUB.
> >> I am not quite sure about this though.
> >
> > This would be a very weird semantics of XEN_PCI_STUB, as the stub part
> > is needed on X86 and on Arm.
> >
> > Gating could even be done with Stefano's patch just by replacing his
> > "!xen_pv_domain()" tests with "!IS_ENABLED(CONFIG_XEN_PCIDEV_BACKEND)".
> 
> Makes sense.
> 
> Another question if we do not want the code to be compiled or not executed?
> 
> If the later then we can define something like:
> 
> bool need_pv_part(void)
> 
> {
> 
>      return IS_ENABLED(CONFIG_XEN_PCIDEV_BACKEND);
> 
> }
> 
> and then just use need_pv_part() for the checks where it is needed.
> 
> This allows avoiding multiple ifdef's through the code

This is even better.

For my clarity, Oleksandr, are you OK with adding a few need_pv_part()
checks through the code as part of this series so that the PV PCI
backend is not initialized?

I don't have a good test environment ready for this, so I cannot really
volunteer myself.

I would prefer if we made this change as part of this series so that the
PV PCI backend features doesn't get enabled on ARM, not even temporarily.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.