[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 06/14] vpci/header: implement guest BAR register handlers



Hi, Roger!

On 12.01.22 14:35, Roger Pau Monné wrote:
> On Thu, Nov 25, 2021 at 01:02:43PM +0200, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>
>> Add relevant vpci register handlers when assigning PCI device to a domain
>> and remove those when de-assigning. This allows having different
>> handlers for different domains, e.g. hwdom and other guests.
>>
>> Emulate guest BAR register values: this allows creating a guest view
>> of the registers and emulates size and properties probe as it is done
>> during PCI device enumeration by the guest.
>>
>> ROM BAR is only handled for the hardware domain and for guest domains
>> there is a stub: at the moment PCI expansion ROM handling is supported
>> for x86 only and it might not be used by other architectures without
>> emulating x86. Other use-cases may include using that expansion ROM before
>> Xen boots, hence no emulation is needed in Xen itself. Or when a guest
>> wants to use the ROM code which seems to be rare.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>> ---
>> Since v4:
>> - updated commit message
>> - s/guest_addr/guest_reg
>> Since v3:
>> - squashed two patches: dynamic add/remove handlers and guest BAR
>>    handler implementation
>> - fix guest BAR read of the high part of a 64bit BAR (Roger)
>> - add error handling to vpci_assign_device
>> - s/dom%pd/%pd
>> - blank line before return
>> Since v2:
>> - remove unneeded ifdefs for CONFIG_HAS_VPCI_GUEST_SUPPORT as more code
>>    has been eliminated from being built on x86
>> Since v1:
>>   - constify struct pci_dev where possible
>>   - do not open code is_system_domain()
>>   - simplify some code3. simplify
>>   - use gdprintk + error code instead of gprintk
>>   - gate vpci_bar_{add|remove}_handlers with CONFIG_HAS_VPCI_GUEST_SUPPORT,
>>     so these do not get compiled for x86
>>   - removed unneeded is_system_domain check
>>   - re-work guest read/write to be much simpler and do more work on write
>>     than read which is expected to be called more frequently
>>   - removed one too obvious comment
>> ---
>>   xen/drivers/vpci/header.c | 72 +++++++++++++++++++++++++++++++++++----
>>   xen/include/xen/vpci.h    |  3 ++
>>   2 files changed, 69 insertions(+), 6 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index ba333fb2f9b0..8880d34ebf8e 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -433,6 +433,48 @@ static void bar_write(const struct pci_dev *pdev, 
>> unsigned int reg,
>>       pci_conf_write32(pdev->sbdf, reg, val);
>>   }
>>   
>> +static void guest_bar_write(const struct pci_dev *pdev, unsigned int reg,
>> +                            uint32_t val, void *data)
>> +{
>> +    struct vpci_bar *bar = data;
>> +    bool hi = false;
>> +
>> +    if ( bar->type == VPCI_BAR_MEM64_HI )
>> +    {
>> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
>> +        bar--;
>> +        hi = true;
>> +    }
>> +    else
>> +    {
>> +        val &= PCI_BASE_ADDRESS_MEM_MASK;
>> +        val |= bar->type == VPCI_BAR_MEM32 ? PCI_BASE_ADDRESS_MEM_TYPE_32
>> +                                           : PCI_BASE_ADDRESS_MEM_TYPE_64;
>> +        val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0;
>> +    }
>> +
>> +    bar->guest_reg &= ~(0xffffffffull << (hi ? 32 : 0));
>> +    bar->guest_reg |= (uint64_t)val << (hi ? 32 : 0);
>> +
>> +    bar->guest_reg &= ~(bar->size - 1) | ~PCI_BASE_ADDRESS_MEM_MASK;
>> +}
>> +
>> +static uint32_t guest_bar_read(const struct pci_dev *pdev, unsigned int reg,
>> +                               void *data)
>> +{
>> +    const struct vpci_bar *bar = data;
>> +    bool hi = false;
>> +
>> +    if ( bar->type == VPCI_BAR_MEM64_HI )
>> +    {
>> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
>> +        bar--;
>> +        hi = true;
>> +    }
>> +
>> +    return bar->guest_reg >> (hi ? 32 : 0);
>> +}
>> +
>>   static void rom_write(const struct pci_dev *pdev, unsigned int reg,
>>                         uint32_t val, void *data)
>>   {
>> @@ -481,6 +523,17 @@ static void rom_write(const struct pci_dev *pdev, 
>> unsigned int reg,
>>           rom->addr = val & PCI_ROM_ADDRESS_MASK;
>>   }
>>   
>> +static void guest_rom_write(const struct pci_dev *pdev, unsigned int reg,
>> +                            uint32_t val, void *data)
>> +{
>> +}
>> +
>> +static uint32_t guest_rom_read(const struct pci_dev *pdev, unsigned int reg,
>> +                               void *data)
>> +{
>> +    return 0xffffffff;
>> +}
> There should be no need for those handlers. As said elsewhere: for
> guests registers not explicitly handled should return ~0 for reads and
> drop writes, which is what you are proposing here.
Yes, you are right: I can see in vpci_read that we end up reading ~0 if no
handler exists (which is what I do here with guest_rom_read). But I am not that
sure about the dropped writes:

void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
                 uint32_t data)
{
     unsigned int data_offset = 0;

[snip]

     if ( data_offset < size )
         /* Tailing gap, write the remaining. */
         vpci_write_hw(sbdf, reg + data_offset, size - data_offset,
                       data >> (data_offset * 8));

so it looks like for the un-handled writes we still reach the HW register.
Could you please tell if the code above needs improvement (like checking
if the write was handled) or I still need to provide a write handler, e.g.
guest_rom_write here?
>> +
>>   static int init_bars(struct pci_dev *pdev)
>>   {
>>       uint16_t cmd;
>> @@ -489,6 +542,7 @@ static int init_bars(struct pci_dev *pdev)
>>       struct vpci_header *header = &pdev->vpci->header;
>>       struct vpci_bar *bars = header->bars;
>>       int rc;
>> +    bool is_hwdom = is_hardware_domain(pdev->domain);
>>   
>>       switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
>>       {
>> @@ -528,8 +582,10 @@ static int init_bars(struct pci_dev *pdev)
>>           if ( i && bars[i - 1].type == VPCI_BAR_MEM64_LO )
>>           {
>>               bars[i].type = VPCI_BAR_MEM64_HI;
>> -            rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, 
>> reg,
>> -                                   4, &bars[i]);
>> +            rc = vpci_add_register(pdev->vpci,
>> +                                   is_hwdom ? vpci_hw_read32 : 
>> guest_bar_read,
>> +                                   is_hwdom ? bar_write : guest_bar_write,
>> +                                   reg, 4, &bars[i]);
>>               if ( rc )
>>               {
>>                   pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
>> @@ -569,8 +625,10 @@ static int init_bars(struct pci_dev *pdev)
>>           bars[i].size = size;
>>           bars[i].prefetchable = val & PCI_BASE_ADDRESS_MEM_PREFETCH;
>>   
>> -        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, reg, 
>> 4,
>> -                               &bars[i]);
>> +        rc = vpci_add_register(pdev->vpci,
>> +                               is_hwdom ? vpci_hw_read32 : guest_bar_read,
>> +                               is_hwdom ? bar_write : guest_bar_write,
>> +                               reg, 4, &bars[i]);
>>           if ( rc )
>>           {
>>               pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
>> @@ -590,8 +648,10 @@ static int init_bars(struct pci_dev *pdev)
>>           header->rom_enabled = pci_conf_read32(pdev->sbdf, rom_reg) &
>>                                 PCI_ROM_ADDRESS_ENABLE;
>>   
>> -        rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rom_write, 
>> rom_reg,
>> -                               4, rom);
>> +        rc = vpci_add_register(pdev->vpci,
>> +                               is_hwdom ? vpci_hw_read32 : guest_rom_read,
>> +                               is_hwdom ? rom_write : guest_rom_write,
>> +                               rom_reg, 4, rom);
> This whole call should be made conditional to is_hwdom, as said above
> there's no need for the guest_rom handlers.
Yes, if writes are indeed dropped, please see question above
>
> Likewise I assume you expect IO BARs to simply return ~0 and drop
> writes, as there's no explicit handler added for those?
Yes, but that was not my intention: I simply didn't handle IO BARs
and now we do need that handling: either with the default behavior
for the unhandled read/write (drop writes, read ~0) or by introducing
the handlers. I hope we can rely on the "unhandled read/write" and
get what we want
>
>>           if ( rc )
>>               rom->type = VPCI_BAR_EMPTY;
>>       }
>> diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h
>> index ed127a08a953..0a73b14a92dc 100644
>> --- a/xen/include/xen/vpci.h
>> +++ b/xen/include/xen/vpci.h
>> @@ -68,7 +68,10 @@ struct vpci {
>>       struct vpci_header {
>>           /* Information about the PCI BARs of this device. */
>>           struct vpci_bar {
>> +            /* Physical view of the BAR. */
> No, that's not the physical view, it's the physical (host) address.
Ok
>
>>               uint64_t addr;
>> +            /* Guest view of the BAR: address and lower bits. */
>> +            uint64_t guest_reg;
> I continue to think it would be clearer if you store the guest address
> here (gaddr, without the low bits) and add those in guest_bar_read
> based on bar->{type,prefetchable}. Then it would be equivalent to the
> existing 'addr' field.
Ok, I'll re-work the code with this approach in mind: s/guest_reg/gaddr +
required code to handle that
>
> I wonder whether we need to protect the added code with
> CONFIG_HAS_VPCI_GUEST_SUPPORT, this would effectively be dead code
> otherwise. Long term I don't think we wish to differentiate between
> dom0 and domU vPCI support at build time, so I'm unsure whether it's
> helpful to pollute the code with CONFIG_HAS_VPCI_GUEST_SUPPORT when
> the plan is to remove those long term.
I would have it without CONFIG_HAS_VPCI_GUEST_SUPPORT if you
don't mind
>
> Thanks, Roger.
Thank you,
Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.