[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 06/10] vpci: Make every domain handle its own BARs


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Date: Fri, 13 Nov 2020 10:46:39 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=YZiq79gaOgQ3DZieciyOSU0byE85sGosJ0zutjqC+rw=; b=P8no0Olw1aD63CcEcgGVUgbWI4mWYHDjjhqXAsbzjVhy4Z6ymrKYMDU6CA5ZqTXDyhxj6oONzstnXF3G/Hcj74jwauezQTV96YulhIRk0/nI6Qz1LZP9iTaB3Brzegp95bDQjxLazEyfcVPUGppKM0jjY5w2BOiwm7KOrOip6bH+6+Inr0IyZ0jtJ3p7j83YMVgbAfEPp899OI/O1Llw9BzqYtSs2sJFyAlQyLnICuQsDOfLIjl500017El2Uj4Fj2W1vw4L/fmtOFiNeQ8THrFEny2y4PLnEXDiDuUmMM5DhLSrYe1sOs2a2+iuJrz9aUgP1Lm3Ed5Z8tojYdhvBQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dc8T5ozEp9nnBTFGk80WxbhNgr8hRnf41mK7wS0OvmzOr+0u4jXNfYfvhgwtvYl3QK+vHQBaOZmC/Mdhu3PVMQqO9JWlalLpIxN1B2grqa5p19VcQWFbXVjP5xQAMYSnJJQ5DrDjWkSkNviw1NfulOUZ5MDvkLOIoJ8lNPQHwyW+GwNIl5TDhEUkTMxrpRw3ebYWd14KhSqK/V/Q/q3XlMB/eI7N3TyHXRt46Y3p5qioGDz76C+xQ0cAneY/MLaZfeyw/Nz13YIJdJYSUO6QnAbmwv+/cKGGiS0AVUWtwYxKSmEt7lx4FOj2YNt4/b8Yx3Uagm6MYYlHNicruX1gXg==
  • Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=epam.com;
  • Cc: Oleksandr Andrushchenko <andr2000@xxxxxxxxx>, "Rahul.Singh@xxxxxxx" <Rahul.Singh@xxxxxxx>, "Bertrand.Marquis@xxxxxxx" <Bertrand.Marquis@xxxxxxx>, "julien.grall@xxxxxxx" <julien.grall@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "iwj@xxxxxxxxxxxxxx" <iwj@xxxxxxxxxxxxxx>, "wl@xxxxxxx" <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Fri, 13 Nov 2020 10:47:12 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHWtpbzyJg77SbpvkSRWLKjWy/3PanEQmgAgAA8YoCAABlOgIABCD8AgABBOQCAAAXIAA==
  • Thread-topic: [PATCH 06/10] vpci: Make every domain handle its own BARs

On 11/13/20 12:25 PM, Jan Beulich wrote:
> On 13.11.2020 07:32, Oleksandr Andrushchenko wrote:
>> On 11/12/20 4:46 PM, Roger Pau Monné wrote:
>>> On Thu, Nov 12, 2020 at 01:16:10PM +0000, Oleksandr Andrushchenko wrote:
>>>> On 11/12/20 11:40 AM, Roger Pau Monné wrote:
>>>>> On Mon, Nov 09, 2020 at 02:50:27PM +0200, Oleksandr Andrushchenko wrote:
>>>>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>>>> +static uint32_t bar_read_dispatch(const struct pci_dev *pdev, unsigned 
>>>>>> int reg,
>>>>>> +                                  void *data)
>>>>>> +{
>>>>>> +    struct vpci_bar *vbar, *bar = data;
>>>>>> +
>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>> +        return bar_read_hwdom(pdev, reg, data);
>>>>>> +
>>>>>> +    vbar = get_vpci_bar(current->domain, pdev, bar->index);
>>>>>> +    if ( !vbar )
>>>>>> +        return ~0;
>>>>>> +
>>>>>> +    return bar_read_guest(pdev, reg, vbar);
>>>>>> +}
>>>>>> +
>>>>>> +static void bar_write_dispatch(const struct pci_dev *pdev, unsigned int 
>>>>>> reg,
>>>>>> +                               uint32_t val, void *data)
>>>>>> +{
>>>>>> +    struct vpci_bar *bar = data;
>>>>>> +
>>>>>> +    if ( is_hardware_domain(current->domain) )
>>>>>> +        bar_write_hwdom(pdev, reg, val, data);
>>>>>> +    else
>>>>>> +    {
>>>>>> +        struct vpci_bar *vbar = get_vpci_bar(current->domain, pdev, 
>>>>>> bar->index);
>>>>>> +
>>>>>> +        if ( !vbar )
>>>>>> +            return;
>>>>>> +        bar_write_guest(pdev, reg, val, vbar);
>>>>>> +    }
>>>>>> +}
>>>>> You should assign different handlers based on whether the domain that
>>>>> has the device assigned is a domU or the hardware domain, rather than
>>>>> doing the selection here.
>>>> Hm, handlers are assigned once in init_bars and this function is only 
>>>> called
>>>>
>>>> for hwdom, so there is no way I can do that for the guests. Hence, the 
>>>> dispatcher
>>> I think we might want to reset the vPCI handlers when a devices gets
>>> assigned and deassigned.
>> In ARM case init_bars is called too early: PCI device assignment is currently
>>
>> initiated by Domain-0' kernel and is done *before* PCI devices are given 
>> memory
>>
>> ranges and BARs assigned:
>>
>> [    0.429514] pci_bus 0000:00: root bus resource [bus 00-ff]
>> [    0.429532] pci_bus 0000:00: root bus resource [io 0x0000-0xfffff]
>> [    0.429555] pci_bus 0000:00: root bus resource [mem 0xfe200000-0xfe3fffff]
>> [    0.429575] pci_bus 0000:00: root bus resource [mem 0x30000000-0x37ffffff]
>> [    0.429604] pci_bus 0000:00: root bus resource [mem 0x38000000-0x3fffffff 
>> pref]
>> [    0.429670] pci 0000:00:00.0: enabling Extended Tags
>> [    0.453764] pci 0000:00:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.453793] pci 0000:00:00.0: -- IRQ 0
>> [    0.458825] pci 0000:00:00.0: Failed to add - passthrough or MSI/MSI-X 
>> might fail!
>> [    0.471790] pci 0000:01:00.0: -------------------- BUS_NOTIFY_ADD_DEVICE
>>
>> < init_bars >
>>
>> [    0.471821] pci 0000:01:00.0: -- IRQ 255
>> [    0.476809] pci 0000:01:00.0: Failed to add - passthrough or MSI/MSI-X 
>> might fail!
>>
>> < BAR assignments below >
>>
>> [    0.488233] pci 0000:00:00.0: BAR 14: assigned [mem 0xfe200000-0xfe2fffff]
>> [    0.488265] pci 0000:00:00.0: BAR 15: assigned [mem 0x38000000-0x380fffff 
>> pref]
>>
>> In case of x86 this is pretty much ok as BARs are already in place, but for 
>> ARM we
>>
>> need to take care and re-setup vPCI BARs for hwdom.
> Even on x86 there's no guarantee that all devices have their BARs set
> up by firmware.

This is true. But there you could have config space trapped in "x86 generic 
way",

please correct me if I'm wrong here

>
> In a subsequent reply you've suggested to move init_bars from "add" to
> "assign", but I'm having trouble seeing what this would change: It's
> not Dom0 controlling assignment (to itself), but Xen assigns the device
> towards the end of pci_add_device().

PHYSDEVOP_pci_device_add vs XEN_DOMCTL_assign_device

Currently we initialize BARs during PHYSDEVOP_pci_device_add and

if we do that during XEN_DOMCTL_assign_device things seem to change

>
>> Things are getting even more
>>
>> complicated if the host PCI bridge is not ECAM like, so you cannot set 
>> mmio_handlers
>>
>> and trap hwdom's access to the config space to update BARs etc. This is why 
>> I have that
>>
>> ugly hack for rcar_gen3 to read actual BARs for hwdom.
> How to config space accesses work there? The latest for MSI/MSI-X it'll
> be imperative that Xen be able to intercept config space writes.
>
>>>    In order to do passthrough to domUs safely
>>> we will have to add more handlers than what's required for dom0,
>> Can you please tell what are thinking about? What are the missing handlers?
>>>    and
>>> having is_hardware_domain sprinkled in all of them is not a suitable
>>> solution.
>> I'll try to replace is_hardware_domain with something like:
>>
>> +/*
>> + * Detect whether physical PCI devices in this segment belong
>> + * to the domain given, e.g. on x86 all PCI devices live in hwdom,
>> + * but in case of ARM this might not be the case: those may also
>> + * live in driver domains or even Xen itself.
>> + */
>> +bool pci_is_hardware_domain(struct domain *d, u16 seg)
>> +{
>> +#ifdef CONFIG_X86
>> +    return is_hardware_domain(d);
>> +#elif CONFIG_ARM
>> +    return pci_is_owner_domain(d, seg);
>> +#else
>> +#error "Unsupported architecture"
>> +#endif
>> +}
>> +
>> +/*
>> + * Get domain which owns this segment: for x86 this is always hardware
>> + * domain and for ARM this can be different.
>> + */
>> +struct domain *pci_get_hardware_domain(u16 seg)
>> +{
>> +#ifdef CONFIG_X86
>> +    return hardware_domain;
>> +#elif CONFIG_ARM
>> +    return pci_get_owner_domain(seg);
>> +#else
>> +#error "Unsupported architecture"
>> +#endif
>> +}
>>
>> This is what I use to properly detect the domain that really owns physical 
>> host bridge
> I consider this problematic. We should try to not let Arm's and x86'es
> PCI implementations diverge too much, i.e. at least the underlying basic
> model would better be similar. For example, if entire segments can be
> assigned to a driver domain on Arm, why should the same not be possible
> on x86?

Good question, probably in this case x86 == ARM and I can use

pci_is_owner_domain for both architectures instead of using is_hardware_domain 
for x86

>
> Furthermore I'm suspicious about segments being the right granularity
> here. On ia64 multiple host bridges could (and typically would) live
> on segment 0. Iirc I had once seen output from an x86 system which was
> apparently laid out similarly. Therefore, just like for MCFG, I think
> the granularity wants to be bus ranges within a segment.
Can you please suggest something we can use as a hint for such a detection 
logic?
>
> Jan

Thank you,

Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.