[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 5/9] vpci/header: Implement guest BAR register handlers


  • To: Jan Beulich <jbeulich@xxxxxxxx>, Oleksandr Andrushchenko <andr2000@xxxxxxxxx>
  • From: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@xxxxxxxx>
  • Date: Wed, 8 Sep 2021 15:14:46 +0000
  • Accept-language: en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=jLpf7T2GS3Z0c/y8shA8fr/NAWe8QNcklPFtyAfEp40=; b=oWlz2kiRzhTOWEUfQhjRy6uvSd6kpe97o4a4n/oTx2B6j2BkjwbDCP2/ZuGehOKbS7sF2WCtsc38hnQI07k+jckRb9JkmEFBqyE6rf/J9BA7Gmy7PPeH2h4NwJHD5sS1NScyCAvjksfNrTs4zF9M4gSi6qY5HOQ99e0R0hH/7wiO5XZMzuaWkBtMGU1QuP+/Ezg2SmoC9kpjBtmoAeVg+fy4Iit0C0IgE5WkYZA5SaSVV6IB54g6HPeOGoMFMfuwDIeWIVTw20X4vi9snBQmuI+1HJRwTfgqzZ1HnNJQZ6LPBgFuNuUBOAo+jlhmqi1lM4DVT9y94s3/uUxOxk4Bhw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=cezgYwNokEuP8IU1WrNp/SpeWHd0ADJdfhLuqqRHvEwFdGsda/MotUEgKYQhbCHzV9WdMbj4ns8W26X3dcvAM7CwN0naeOx+LUjHKHOg08uDyDFhIlFjSuEBHGoiY96Iq6OqDQwtqtf0AwqzdHhp+7mAxwRFRWKOxZ7t52T2Pxhlb5CqXFt/TGXMNe23i8P90bkstwq0b3DdCfK6bJkA04LQsUTdQA9e3AVUK9hYTThmftnuXVuSXqMy7r553X+MSJtRQWS8+gmJLHt+6m9fu3Q/VSaC6iY1pUIphEOVLMOgUz8i4TcvgNN12ocXOIrM3Xyo7CxvCRikL55dFaGJYA==
  • Authentication-results: suse.com; dkim=none (message not signed) header.d=none;suse.com; dmarc=none action=none header.from=epam.com;
  • Cc: "julien@xxxxxxx" <julien@xxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <Oleksandr_Tyshchenko@xxxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, Artem Mygaiev <Artem_Mygaiev@xxxxxxxx>, "roger.pau@xxxxxxxxxx" <roger.pau@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Rahul Singh <rahul.singh@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 08 Sep 2021 15:15:00 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHXoKxmxnIdZosa7ESbvTmPlG6KHauXFn+AgAGCLICAADFtgIAAE0qAgAEIooCAAAS6AIAABV4AgAA6yACAABRDgIAACAOA
  • Thread-topic: [PATCH 5/9] vpci/header: Implement guest BAR register handlers

On 08.09.21 17:46, Jan Beulich wrote:
> On 08.09.2021 15:33, Oleksandr Andrushchenko wrote:
>> On 08.09.21 13:03, Jan Beulich wrote:
>>> On 08.09.2021 11:43, Oleksandr Andrushchenko wrote:
>>>> On 08.09.21 12:27, Jan Beulich wrote:
>>>>> On 07.09.2021 19:39, Oleksandr Andrushchenko wrote:
>>>>>> On 07.09.21 19:30, Jan Beulich wrote:
>>>>>>> On 07.09.2021 15:33, Oleksandr Andrushchenko wrote:
>>>>>>>> On 06.09.21 17:31, Jan Beulich wrote:
>>>>>>>>> On 03.09.2021 12:08, Oleksandr Andrushchenko wrote:
>>>>>>>>>> --- a/xen/drivers/vpci/header.c
>>>>>>>>>> +++ b/xen/drivers/vpci/header.c
>>>>>>>>>> @@ -400,12 +400,72 @@ static void bar_write(const struct pci_dev 
>>>>>>>>>> *pdev, unsigned int reg,
>>>>>>>>>>       static void guest_bar_write(const struct pci_dev *pdev, 
>>>>>>>>>> unsigned int reg,
>>>>>>>>>>                                   uint32_t val, void *data)
>>>>>>>>>>       {
>>>>>>>>>> +    struct vpci_bar *bar = data;
>>>>>>>>>> +    bool hi = false;
>>>>>>>>>> +
>>>>>>>>>> +    if ( bar->type == VPCI_BAR_MEM64_HI )
>>>>>>>>>> +    {
>>>>>>>>>> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
>>>>>>>>>> +        bar--;
>>>>>>>>>> +        hi = true;
>>>>>>>>>> +    }
>>>>>>>>>> +    else
>>>>>>>>>> +        val &= PCI_BASE_ADDRESS_MEM_MASK;
>>>>>>>>>> +    bar->guest_addr &= ~(0xffffffffull << (hi ? 32 : 0));
>>>>>>>>>> +    bar->guest_addr |= (uint64_t)val << (hi ? 32 : 0);
>>>>>>>>> What you store here is not the address that's going to be used,
>>>>>>>>>       as
>>>>>>>>> you don't mask off the low bits (to account for the BAR's size).
>>>>>>>>> When a BAR gets written with all ones, all writable bits get these
>>>>>>>>> ones stored. The address of the BAR, aiui, really changes to
>>>>>>>>> (typically) close below 4Gb (in the case of a 32-bit BAR), which
>>>>>>>>> is why memory / I/O decoding should be off while sizing BARs.
>>>>>>>>> Therefore you shouldn't look for the specific "all writable bits
>>>>>>>>> are ones" pattern (or worse, as you presently do, the "all bits
>>>>>>>>> outside of the type specifier are ones" one) on the read path.
>>>>>>>>> Instead mask the value appropriately here, and simply return back
>>>>>>>>> the stored value from the read path.
>>>> But in case of BAR sizing I need to actually return BAR size.
>>>> So, the comparison is the way to tell if the guest wants to read
>>>> actual (configured) BAR value or it tries to determine BAR's size.
>>>> This is why I compare and use the result as the answer to what needs
>>>> to be supplied to the guest. So, if I don't compare with 0xffffffff for the
>>>> hi part and 0xfffffff0 for the low then how do I know when to return
>>>> configured BAR or return the size?
>>> Well, but that's the common misunderstanding that I've been trying
>>> to point out: There's no difference between these two forms of
>>> reads. The BARs are simply registers with some r/o bits. There's
>>> no hidden 2nd register recording what was last written. When you
>>> write 0xffffffff, all you do is set all writable bits to 1. When
>>> you read back from the register, you will find all r/o bits
>>> unchanged (which in particular means all lower address bits are
>>> zero, thus allowing you to determine the size).
>>>
>>> When the spec says to write 0xffffffff for sizing purposes, OSes
>>> should follow that, yes. This doesn't preclude them to use other
>>> forms of writes for whichever purpose. Hence you do not want to
>>> special case sizing, but instead you want to emulate correctly
>>> all forms of writes, including subsequent reads to uniformly
>>> return the intended / expected values.
>>>
>>> Just to give an example (perhaps a little contrived): To size a
>>> 64-bit BAR, in principle you'd first need to write 0xffffffff to
>>> both halves. But there's nothing wrong with doing this in a
>>> different order: Act on the low half alone first, and then act
>>> on the high half. The acting on the high half could even be
>>> skipped if the low half sizing produced at least bit 31 set. Now
>>> if you were to special case seeing ffffffff:fffffff? as the
>>> last written pair of values, you'd break that (imo legitimate)
>>> alternative process of sizing.
>> How about:
> Yes, that's what I was after. Just one nit right away:
>
>> static void guest_bar_write(const struct pci_dev *pdev, unsigned int reg,
>>                               uint32_t val, void *data)
>> {
>>       struct vpci_bar *bar = data;
>>       bool hi = false;
>>
>>       if ( bar->type == VPCI_BAR_MEM64_HI )
>>       {
>>           ASSERT(reg > PCI_BASE_ADDRESS_0);
>>           bar--;
>>           hi = true;
>>       }
>>       else
>>       {
>>           val &= PCI_BASE_ADDRESS_MEM_MASK;
>>           val |= bar->type == VPCI_BAR_MEM32 ? PCI_BASE_ADDRESS_MEM_TYPE_32
>>                                              : PCI_BASE_ADDRESS_MEM_TYPE_64;
>>           val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0;
>>       }
>>
>>       bar->guest_addr &= ~(0xffffffffull << (hi ? 32 : 0));

Do you think this needs to be 0xfffffffful, not 0xffffffffull?

e.g. s/ull/ul

>>       bar->guest_addr |= (uint64_t)val << (hi ? 32 : 0);
>>
>>       bar->guest_addr &= ~(bar->size - 1) | ~PCI_BASE_ADDRESS_MEM_MASK;
>> }
>>
>> static uint32_t guest_bar_read(const struct pci_dev *pdev, unsigned int reg,
>>                                  void *data)
>> {
>>       struct vpci_bar *bar = data;
> const please.
Sure
>
> Jan
>
Thank you for helping with this!!

Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.