[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 05/13] vpci: add hooks for PCI device assign/de-assign




On 08.02.22 12:09, Jan Beulich wrote:
> On 08.02.2022 10:55, Oleksandr Andrushchenko wrote:
>>
>> On 08.02.22 11:44, Jan Beulich wrote:
>>> On 08.02.2022 10:27, Oleksandr Andrushchenko wrote:
>>>> On 08.02.22 11:13, Jan Beulich wrote:
>>>>> On 08.02.2022 09:32, Oleksandr Andrushchenko wrote:
>>>>>> On 07.02.22 18:28, Jan Beulich wrote:
>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>>> @@ -1507,6 +1511,8 @@ static int assign_device(struct domain *d, u16 
>>>>>>>> seg, u8 bus, u8 devfn, u32 flag)
>>>>>>>>                              pci_to_dev(pdev), flag);
>>>>>>>>          }
>>>>>>>>      
>>>>>>>> +    rc = vpci_assign_device(d, pdev);
>>>>>>>> +
>>>>>>>>       done:
>>>>>>>>          if ( rc )
>>>>>>>>              printk(XENLOG_G_WARNING "%pd: assign (%pp) failed (%d)\n",
>>>>>>> There's no attempt to undo anything in the case of getting back an
>>>>>>> error. ISTR this being deemed okay on the basis that the tool stack
>>>>>>> would then take whatever action, but whatever it is that is supposed
>>>>>>> to deal with errors here wants spelling out in the description.
>>>>>> Why? I don't change the previously expected decision and implementation
>>>>>> of the assign_device function: I use error paths as they were used before
>>>>>> for the existing code. So, I see no clear reason to stress that the 
>>>>>> existing
>>>>>> and new code relies on the toolstack
>>>>> Saying half a sentence on this is helping review.
>>>> Ok
>>>>>>> What's important is that no caller up the call tree may be left with
>>>>>>> the impression that the device is still owned by the original
>>>>>>> domain. With how you have it, the device is going to be owned by the
>>>>>>> new domain, but not really usable.
>>>>>> This is not true: vpci_assign_device will call vpci_deassign_device
>>>>>> internally if it fails. So, the device won't be assigned in this case
>>>>> No. The device is assigned to whatever pdev->domain holds. Calling
>>>>> vpci_deassign_device() there merely makes sure that the device will
>>>>> have _no_ vPCI data and hooks in place, rather than something
>>>>> partial.
>>>> So, this patch is only dealing with vpci assign/de-assign
>>>> And it rolls back what it did in case of a failure
>>>> It also returns rc in assign_device to signal it has failed
>>>> What else is expected from this patch??
>>> Until now if assign_device() returns an error, this tells the caller
>>> that the device did not change ownership;
>> Not sure this is the case:
>>       if ( (rc = iommu_call(hd->platform_ops, assign_device, d, devfn,
>>                             pci_to_dev(pdev), flag)) )
>> iommu_call can leave the new ownership even now without
>> vpci_assign_device.
> Did you check the actual hook functions for when exactly the ownership
> change happens. For both VT-d and AMD it is the last thing they do,
> when no error can occur anymore.
This functionality does not exist for Arm yet, so this is up to the
future series to add that.

WRT to the existing code:

static int amd_iommu_assign_device(struct domain *d, u8 devfn,
                                    struct pci_dev *pdev,
                                    u32 flag)
{
     if ( !rc )
         rc = reassign_device(pdev->domain, d, devfn, pdev); <<<<< this will 
set pdev->domain

     if ( rc && !is_hardware_domain(d) )
     {
         int ret = amd_iommu_reserve_domain_unity_unmap(
                       d, ivrs_mappings[req_id].unity_map);

         if ( ret )
         {
             printk(XENLOG_ERR "AMD-Vi: "
                    "unity-unmap for %pd/%04x:%02x:%02x.%u failed (%d)\n",
                    d, pdev->seg, pdev->bus,
                    PCI_SLOT(devfn), PCI_FUNC(devfn), ret);
             domain_crash(d);
         }
So....

This is IMO wrong in the first place to let IOMMU code assign pdev->domain.
This is something that needs to be done by the PCI code itself and
not relying on each IOMMU callback implementation
>
>   My understanding is that the roll-back is
>> expected to be performed by the toolstack and vpci_assign_device
>> doesn't prevent that by returning rc. Even more, before we discussed
>> that it would be good for vpci_assign_device to try recovering from
>> a possible error early which is done by calling vpci_deassign_device
>> internally.
> Yes, but that's only part of it. It at least needs considering what
> effects have resulted from operations prior to vpci_assign_device().
Taking into account the code snippet above: what is your expectation
from this patch with this respect?

>
> Jan
>
Thank you,
Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.