[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v6 05/13] vpci: add hooks for PCI device assign/de-assign




On 08.02.22 12:29, Jan Beulich wrote:
> On 08.02.2022 11:22, Oleksandr Andrushchenko wrote:
>>
>> On 08.02.22 12:09, Jan Beulich wrote:
>>> On 08.02.2022 10:55, Oleksandr Andrushchenko wrote:
>>>> On 08.02.22 11:44, Jan Beulich wrote:
>>>>> On 08.02.2022 10:27, Oleksandr Andrushchenko wrote:
>>>>>> On 08.02.22 11:13, Jan Beulich wrote:
>>>>>>> On 08.02.2022 09:32, Oleksandr Andrushchenko wrote:
>>>>>>>> On 07.02.22 18:28, Jan Beulich wrote:
>>>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>>>>> @@ -1507,6 +1511,8 @@ static int assign_device(struct domain *d, u16 
>>>>>>>>>> seg, u8 bus, u8 devfn, u32 flag)
>>>>>>>>>>                               pci_to_dev(pdev), flag);
>>>>>>>>>>           }
>>>>>>>>>>       
>>>>>>>>>> +    rc = vpci_assign_device(d, pdev);
>>>>>>>>>> +
>>>>>>>>>>        done:
>>>>>>>>>>           if ( rc )
>>>>>>>>>>               printk(XENLOG_G_WARNING "%pd: assign (%pp) failed 
>>>>>>>>>> (%d)\n",
>>>>>>>>> There's no attempt to undo anything in the case of getting back an
>>>>>>>>> error. ISTR this being deemed okay on the basis that the tool stack
>>>>>>>>> would then take whatever action, but whatever it is that is supposed
>>>>>>>>> to deal with errors here wants spelling out in the description.
>>>>>>>> Why? I don't change the previously expected decision and implementation
>>>>>>>> of the assign_device function: I use error paths as they were used 
>>>>>>>> before
>>>>>>>> for the existing code. So, I see no clear reason to stress that the 
>>>>>>>> existing
>>>>>>>> and new code relies on the toolstack
>>>>>>> Saying half a sentence on this is helping review.
>>>>>> Ok
>>>>>>>>> What's important is that no caller up the call tree may be left with
>>>>>>>>> the impression that the device is still owned by the original
>>>>>>>>> domain. With how you have it, the device is going to be owned by the
>>>>>>>>> new domain, but not really usable.
>>>>>>>> This is not true: vpci_assign_device will call vpci_deassign_device
>>>>>>>> internally if it fails. So, the device won't be assigned in this case
>>>>>>> No. The device is assigned to whatever pdev->domain holds. Calling
>>>>>>> vpci_deassign_device() there merely makes sure that the device will
>>>>>>> have _no_ vPCI data and hooks in place, rather than something
>>>>>>> partial.
>>>>>> So, this patch is only dealing with vpci assign/de-assign
>>>>>> And it rolls back what it did in case of a failure
>>>>>> It also returns rc in assign_device to signal it has failed
>>>>>> What else is expected from this patch??
>>>>> Until now if assign_device() returns an error, this tells the caller
>>>>> that the device did not change ownership;
>>>> Not sure this is the case:
>>>>        if ( (rc = iommu_call(hd->platform_ops, assign_device, d, devfn,
>>>>                              pci_to_dev(pdev), flag)) )
>>>> iommu_call can leave the new ownership even now without
>>>> vpci_assign_device.
>>> Did you check the actual hook functions for when exactly the ownership
>>> change happens. For both VT-d and AMD it is the last thing they do,
>>> when no error can occur anymore.
>> This functionality does not exist for Arm yet, so this is up to the
>> future series to add that.
>>
>> WRT to the existing code:
>>
>> static int amd_iommu_assign_device(struct domain *d, u8 devfn,
>>                                      struct pci_dev *pdev,
>>                                      u32 flag)
>> {
>>       if ( !rc )
>>           rc = reassign_device(pdev->domain, d, devfn, pdev); <<<<< this 
>> will set pdev->domain
>>
>>       if ( rc && !is_hardware_domain(d) )
>>       {
>>           int ret = amd_iommu_reserve_domain_unity_unmap(
>>                         d, ivrs_mappings[req_id].unity_map);
>>
>>           if ( ret )
>>           {
>>               printk(XENLOG_ERR "AMD-Vi: "
>>                      "unity-unmap for %pd/%04x:%02x:%02x.%u failed (%d)\n",
>>                      d, pdev->seg, pdev->bus,
>>                      PCI_SLOT(devfn), PCI_FUNC(devfn), ret);
>>               domain_crash(d);
>>           }
>> So....
>>
>> This is IMO wrong in the first place to let IOMMU code assign pdev->domain.
>> This is something that needs to be done by the PCI code itself and
>> not relying on each IOMMU callback implementation
>>>    My understanding is that the roll-back is
>>>> expected to be performed by the toolstack and vpci_assign_device
>>>> doesn't prevent that by returning rc. Even more, before we discussed
>>>> that it would be good for vpci_assign_device to try recovering from
>>>> a possible error early which is done by calling vpci_deassign_device
>>>> internally.
>>> Yes, but that's only part of it. It at least needs considering what
>>> effects have resulted from operations prior to vpci_assign_device().
>> Taking into account the code snippet above: what is your expectation
>> from this patch with this respect?
> You did note the domain_crash() in there, didn't you?
Which is AMD specific implementation which can be different for
other IOMMUs. Yes, I did.
> The snippet above
> still matches the "device not assigned to an alive DomU" criteria (which
> can be translated to "no exposure of a device to an untrusted entity in
> case of error"). Such domain_crash() uses aren't nice, and I'd prefer to
> see them go away, but said property needs to be retained with any
> alternative solutions.
This smells like we first need to fix the existing code, so
pdev->domain is not assigned by specific IOMMU implementations,
but instead controlled by the code which relies on that, assign_device.

I can have something like:

diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
index 88836aab6baf..cc7790709a50 100644
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -1475,6 +1475,7 @@ static int device_assigned(u16 seg, u8 bus, u8 devfn)
  static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 
flag)
  {
      const struct domain_iommu *hd = dom_iommu(d);
+    struct domain *old_owner;
      struct pci_dev *pdev;
      int rc = 0;

@@ -1490,6 +1491,9 @@ static int assign_device(struct domain *d, u16 seg, u8 
bus, u8 devfn, u32 flag)
      ASSERT(pdev && (pdev->domain == hardware_domain ||
                      pdev->domain == dom_io));

+    /* We need to restore the old owner in case of an error. */
+    old_owner = pdev->domain;
+
      vpci_deassign_device(pdev->domain, pdev);

      rc = pdev_msix_assign(d, pdev);
@@ -1515,8 +1519,12 @@ static int assign_device(struct domain *d, u16 seg, u8 
bus, u8 devfn, u32 flag)

   done:
      if ( rc )
+    {
          printk(XENLOG_G_WARNING "%pd: assign (%pp) failed (%d)\n",
                 d, &PCI_SBDF3(seg, bus, devfn), rc);
+        /* We failed to assign, so restore the previous owner. */
+        pdev->domain = old_owner;
+    }
      /* The device is assigned to dom_io so mark it as quarantined */
      else if ( d == dom_io )
          pdev->quarantine = true;

But I do not think this belongs to this patch
>
> Jan
>
Thank you,
Oleksandr

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.