
Re: [PATCH 2/5] xen/vpci: make BAR mapping more resilient for the hardware domain


  • To: Roger Pau Monne <roger.pau@xxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
  • Date: Fri, 15 Aug 2025 15:24:10 -0400
  • Cc: <Jiqian.Chen@xxxxxxx>, <jbeulich@xxxxxxxx>
  • Delivery-date: Fri, 15 Aug 2025 19:24:24 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 8/15/25 14:53, Stewart Hildebrand wrote:
> On 8/14/25 12:03, Roger Pau Monne wrote:
>> The logic in map_range() will bubble up failures to the upper layer, which
>> will result in any remaining regions being skipped, and for the non-hardware
>> domain case the owner domain of the device being destroyed.  However for
>> the hardware domain the intent is to continue execution, hoping the
>> failure to modify the p2m could be worked around by the hardware domain.
>>
>> To accomplish that in a better way, ignore failures and skip the range in
>> that case, possibly continuing to map further ranges.
>>
>> Since the error path in vpci_process_pending() should only be used by domUs
>> now, and it will unconditionally end up calling domain_crash(), simplify
>> it: there's no need to clean up if the domain will be destroyed.
>>
>> No functional change for domUs intended.
>>
>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> ---
>>  xen/drivers/vpci/header.c | 51 +++++++++++++++++++++------------------
>>  1 file changed, 28 insertions(+), 23 deletions(-)
>>
>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>> index b9756b364300..1035dcca242d 100644
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -64,7 +64,8 @@ static int cf_check map_range(
>>              printk(XENLOG_G_WARNING
>>                     "%pd denied access to MMIO range [%#lx, %#lx]\n",
>>                     map->d, map_mfn, m_end);
>> -            return -EPERM;
>> +            rc = -EPERM;
>> +            goto done;
>>          }
>>  
>>          rc = xsm_iomem_mapping(XSM_HOOK, map->d, map_mfn, m_end, map->map);
>> @@ -73,7 +74,7 @@ static int cf_check map_range(
>>              printk(XENLOG_G_WARNING
>>                     "%pd XSM denied access to MMIO range [%#lx, %#lx]: %d\n",
>>                     map->d, map_mfn, m_end, rc);
>> -            return rc;
>> +            goto done;
>>          }
>>  
>>          /*
>> @@ -87,17 +88,27 @@ static int cf_check map_range(
>>  
>>          rc = map->map ? map_mmio_regions(map->d, _gfn(s), size, _mfn(map_mfn))
>>                        : unmap_mmio_regions(map->d, _gfn(s), size, _mfn(map_mfn));
>> -        if ( rc == 0 )
>> -        {
>> -            *c += size;
>> -            break;
>> -        }
>>          if ( rc < 0 )
>>          {
>>              printk(XENLOG_G_WARNING
>>                     "Failed to %smap [%lx %lx] -> [%lx %lx] for %pd: %d\n",
>>                     map->map ? "" : "un", s, e, map_mfn,
>>                     map_mfn + size, map->d, rc);
>> +            goto done;
>> +        }
>> +        if ( rc == 0 )
>> +        {
>> + done:
>> +            if ( is_hardware_domain(map->d) )
>> +            {
>> +                /*
>> +                 * Ignore failures for the hardware domain and skip the range.
>> +                 * Do it as a best effort workaround to attempt to get the
>> +                 * hardware domain to boot.
>> +                 */
>> +                rc = 0;
> 
> If we return success and size is zero, we will potentially attempt to
> map/unmap the same region again in an infinite loop.
> rangeset_consume_ranges would invoke map_range again directly without
> returning to vpci_process_pending.

Sorry, I sent the previous email too soon; I see now that it shouldn't be
possible for size to be zero.

> 
>> +                *c += size;
> 
> This line is now only executed for hwdom, but ...
> 
>> +            }
> 
> ... it should go outside of the hwdom check because domUs still need it.
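
To illustrate, moving the increment out of the hwdom-only branch would keep
the consumed counter updated for domUs too. Untested sketch, just to show
the intended placement:

        if ( rc == 0 )
        {
 done:
            if ( is_hardware_domain(map->d) )
                /*
                 * Ignore failures for the hardware domain and skip the
                 * range as a best effort workaround.
                 */
                rc = 0;
            *c += size;
            break;
        }

The domU error path would still break out with rc != 0 and end in
domain_crash(), so the extra accounting on that path should be harmless.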
> 
>>              break;
>>          }
>>          ASSERT(rc < size);
>> @@ -213,28 +224,22 @@ bool vpci_process_pending(struct vcpu *v)
>>              return true;
>>          }
>>  
>> -        if ( rc )
>> +        if ( rc && !is_hardware_domain(v->domain) )
>>          {
>> -            spin_lock(&pdev->vpci->lock);
>> -            /* Disable memory decoding unconditionally on failure. */
>> -            modify_decoding(pdev, v->vpci.cmd & ~PCI_COMMAND_MEMORY,
>> -                            false);
> 
> This path could be taken for either map or unmap. Do we still want to disable
> memory decoding in case of unmap?
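
If the decoding change is only wanted for a failed map, the removed hunk
could instead have been guarded on the pending command value, e.g. (untested
sketch, assuming v->vpci.cmd still reflects the write being processed at
this point):

        if ( rc && (v->vpci.cmd & PCI_COMMAND_MEMORY) )
        {
            spin_lock(&pdev->vpci->lock);
            /* Disable memory decoding only when a map failed. */
            modify_decoding(pdev, v->vpci.cmd & ~PCI_COMMAND_MEMORY, false);
            spin_unlock(&pdev->vpci->lock);
        }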
> 
>> -            spin_unlock(&pdev->vpci->lock);
>> -
>> -            /* Clean all the rangesets */
>> -            for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
>> -                if ( !rangeset_is_empty(header->bars[i].mem) )
>> -                     rangeset_purge(header->bars[i].mem);
>> -
>> -            v->vpci.pdev = NULL;
>> -
>>              read_unlock(&v->domain->pci_lock);
>>  
>> -            if ( !is_hardware_domain(v->domain) )
>> -                domain_crash(v->domain);
>> +            domain_crash(v->domain);
>>  
>>              return false;
>>          }
>> +        ASSERT(!rc);
>> +        /*
>> +         * Purge rangeset to deal with the hardware domain having triggered an
>> +         * error.  It shouldn't be possible, as map_range() will always swallow
>> +         * errors for hardware domain owned devices, and
>> +         * rangeset_consume_ranges() itself doesn't generate any errors.
>> +         */
>> +        rangeset_purge(bar->mem);
> 
> Reiterating what was mentioned above: if map_range returned 0 without
> incrementing *c, we won't make it back here.
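
To make the concern concrete: if the callback reports success without
bumping the consumed counter, the consumer cannot make progress. Toy model
of the consume loop below (hypothetical code for illustration only, not the
actual rangeset_consume_ranges() body):

    static int consume(unsigned long s, unsigned long e,
                       int (*cb)(unsigned long s, unsigned long e,
                                 unsigned long *c))
    {
        int rc = 0;

        while ( s <= e )
        {
            unsigned long consumed = 0;

            rc = cb(s, e, &consumed);
            if ( rc )
                break;
            /* If cb() succeeded but left consumed at 0, s never advances. */
            s += consumed;
        }

        return rc;
    }

With the hunk as posted, a domU success path reaches the done: label without
executing *c += size, so we'd never get back to the rangeset_purge() call.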
> 
>>      }
>>      v->vpci.pdev = NULL;
>>  
