[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 06/13] vpci/header: implement guest BAR register handlers




On 08.02.22 11:25, Roger Pau Monné wrote:
> On Fri, Feb 04, 2022 at 08:34:52AM +0200, Oleksandr Andrushchenko wrote:
>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>
>> Add relevant vpci register handlers when assigning PCI device to a domain
>> and remove those when de-assigning. This allows having different
>> handlers for different domains, e.g. hwdom and other guests.
>>
>> Emulate guest BAR register values: this allows creating a guest view
>> of the registers and emulates size and properties probe as it is done
>> during PCI device enumeration by the guest.
>>
>> All empty, IO and ROM BARs for guests are emulated by returning 0 on
>> reads and ignoring writes: this BARs are special with this respect as
>> their lower bits have special meaning, so returning default ~0 on read
>> may confuse guest OS.
>>
>> Memory decoding is initially disabled when used by guests in order to
>> prevent the BAR being placed on top of a RAM region.
>>
>> Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>> ---
>> Since v5:
>> - make sure that the guest set address has the same page offset
>>    as the physical address on the host
>> - remove guest_rom_{read|write} as those just implement the default
>>    behaviour of the registers not being handled
>> - adjusted comment for struct vpci.addr field
>> - add guest handlers for BARs which are not handled and will otherwise
>>    return ~0 on read and ignore writes. The BARs are special with this
>>    respect as their lower bits have special meaning, so returning ~0
>>    doesn't seem to be right
>> Since v4:
>> - updated commit message
>> - s/guest_addr/guest_reg
>> Since v3:
>> - squashed two patches: dynamic add/remove handlers and guest BAR
>>    handler implementation
>> - fix guest BAR read of the high part of a 64bit BAR (Roger)
>> - add error handling to vpci_assign_device
>> - s/dom%pd/%pd
>> - blank line before return
>> Since v2:
>> - remove unneeded ifdefs for CONFIG_HAS_VPCI_GUEST_SUPPORT as more code
>>    has been eliminated from being built on x86
>> Since v1:
>>   - constify struct pci_dev where possible
>>   - do not open code is_system_domain()
>>   - simplify some code3. simplify
>>   - use gdprintk + error code instead of gprintk
>>   - gate vpci_bar_{add|remove}_handlers with CONFIG_HAS_VPCI_GUEST_SUPPORT,
>>     so these do not get compiled for x86
>>   - removed unneeded is_system_domain check
>>   - re-work guest read/write to be much simpler and do more work on write
>>     than read which is expected to be called more frequently
>>   - removed one too obvious comment
>> ---
>>   xen/drivers/vpci/header.c | 131 +++++++++++++++++++++++++++++++++-----
>>   xen/include/xen/vpci.h    |   3 +
>>   2 files changed, 118 insertions(+), 16 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index bd23c0274d48..2620a95ff35b 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -406,6 +406,81 @@ static void bar_write(const struct pci_dev *pdev, 
>> unsigned int reg,
>>       pci_conf_write32(pdev->sbdf, reg, val);
>>   }
>>   
>> +static void guest_bar_write(const struct pci_dev *pdev, unsigned int reg,
>> +                            uint32_t val, void *data)
>> +{
>> +    struct vpci_bar *bar = data;
>> +    bool hi = false;
>> +    uint64_t guest_reg = bar->guest_reg;
>> +
>> +    if ( bar->type == VPCI_BAR_MEM64_HI )
>> +    {
>> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
>> +        bar--;
>> +        hi = true;
>> +    }
>> +    else
>> +    {
>> +        val &= PCI_BASE_ADDRESS_MEM_MASK;
>> +        val |= bar->type == VPCI_BAR_MEM32 ? PCI_BASE_ADDRESS_MEM_TYPE_32
>> +                                           : PCI_BASE_ADDRESS_MEM_TYPE_64;
>> +        val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0;
>> +    }
>> +
>> +    guest_reg &= ~(0xffffffffull << (hi ? 32 : 0));
>> +    guest_reg |= (uint64_t)val << (hi ? 32 : 0);
>> +
>> +    guest_reg &= ~(bar->size - 1) | ~PCI_BASE_ADDRESS_MEM_MASK;
>> +
>> +    /*
>> +     * Make sure that the guest set address has the same page offset
>> +     * as the physical address on the host or otherwise things won't work as
>> +     * expected.
>> +     */
>> +    if ( (guest_reg & (~PAGE_MASK & PCI_BASE_ADDRESS_MEM_MASK)) !=
>> +         (bar->addr & ~PAGE_MASK) )
> This is only required when !hi, but I'm fine with doing it
> unconditionally as it's clearer.
This is correct wrt hi
>
>> +    {
>> +        gprintk(XENLOG_WARNING,
>> +                "%pp: ignored BAR %zu write with wrong page offset\n",
> "%pp: ignored BAR %zu write attempting to change page offset\n"
Ok
>
>> +                &pdev->sbdf, bar - pdev->vpci->header.bars + hi);
>> +        return;
>> +    }
>> +
>> +    bar->guest_reg = guest_reg;
>> +}
>> +
>> +static uint32_t guest_bar_read(const struct pci_dev *pdev, unsigned int reg,
>> +                               void *data)
>> +{
>> +    const struct vpci_bar *bar = data;
>> +    bool hi = false;
>> +
>> +    if ( bar->type == VPCI_BAR_MEM64_HI )
>> +    {
>> +        ASSERT(reg > PCI_BASE_ADDRESS_0);
>> +        bar--;
>> +        hi = true;
>> +    }
>> +
>> +    return bar->guest_reg >> (hi ? 32 : 0);
>> +}
>> +
>> +static uint32_t guest_bar_ignore_read(const struct pci_dev *pdev,
>> +                                      unsigned int reg, void *data)
>> +{
>> +    return 0;
>> +}
>> +
>> +static int bar_ignore_access(const struct pci_dev *pdev, unsigned int reg,
>> +                             struct vpci_bar *bar)
>> +{
>> +    if ( is_hardware_domain(pdev->domain) )
>> +        return 0;
>> +
>> +    return vpci_add_register(pdev->vpci, guest_bar_ignore_read, NULL,
>> +                             reg, 4, bar);
>> +}
>> +
>>   static void rom_write(const struct pci_dev *pdev, unsigned int reg,
>>                         uint32_t val, void *data)
>>   {
>> @@ -462,6 +537,7 @@ static int init_bars(struct pci_dev *pdev)
>>       struct vpci_header *header = &pdev->vpci->header;
>>       struct vpci_bar *bars = header->bars;
>>       int rc;
>> +    bool is_hwdom = is_hardware_domain(pdev->domain);
>>   
>>       switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f )
>>       {
>> @@ -501,8 +577,10 @@ static int init_bars(struct pci_dev *pdev)
>>           if ( i && bars[i - 1].type == VPCI_BAR_MEM64_LO )
>>           {
>>               bars[i].type = VPCI_BAR_MEM64_HI;
>> -            rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, 
>> reg,
>> -                                   4, &bars[i]);
>> +            rc = vpci_add_register(pdev->vpci,
>> +                                   is_hwdom ? vpci_hw_read32 : 
>> guest_bar_read,
>> +                                   is_hwdom ? bar_write : guest_bar_write,
>> +                                   reg, 4, &bars[i]);
>>               if ( rc )
>>               {
>>                   pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);
>> @@ -516,6 +594,11 @@ static int init_bars(struct pci_dev *pdev)
>>           if ( (val & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO )
>>           {
>>               bars[i].type = VPCI_BAR_IO;
>> +
>> +            rc = bar_ignore_access(pdev, reg, &bars[i]);
> This is wrong: you only want to ignore access to IO BARs for Arm, for
> x86 we should keep the previous behavior. Even more if you go with
> Jan's suggestions to make bar_ignore_access also applicable to dom0.
How do we want this?
#ifdef CONFIG_ARM?
>
>> +            if ( rc )
>> +                return rc;
>> +
>>               continue;
>>           }
>>           if ( (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
>> @@ -535,6 +618,11 @@ static int init_bars(struct pci_dev *pdev)
>>           if ( size == 0 )
>>           {
>>               bars[i].type = VPCI_BAR_EMPTY;
>> +
>> +            rc = bar_ignore_access(pdev, reg, &bars[i]);
>> +            if ( rc )
>> +                return rc;
> I would be fine to just call vpci_add_register here, ie;
>
> if ( !is_hwdom )
> {
>      rc = vpci_add_register(pdev->vpci, guest_bar_ignore_read, NULL,
>                             reg, 4, &bars[i]);
>       if ( rc )
>       {
>           ...
>       }
> }
But we have 3 places where we do the same and also handle errors
the same way. I was thinking having a helper will make the code
clearer. Do you want to open code all the uses?
> Feel free to unify the writing of the PCI_COMMAND register on the
> error path into a label, as then the error case would simply be a
> `goto error;`
I was thinking about it. Will it be ok to make this change in this patch
or you want a dedicated one for that?
> Thanks, Roger.
Thank you,
Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.