
Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal



On 16.11.2021 08:32, Oleksandr Andrushchenko wrote:
> On 15.11.21 18:56, Jan Beulich wrote:
>> On 05.11.2021 07:56, Oleksandr Andrushchenko wrote:
>>> From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx>
>>>
>>> When a vPCI is removed for a PCI device it is possible that we have
>>> scheduled delayed work for map/unmap operations for that device.
>>> For example, the following scenario can illustrate the problem:
>>>
>>> pci_physdev_op
>>>     pci_add_device
>>>         init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
>>>     iommu_add_device <- FAILS
>>>     vpci_remove_device -> xfree(pdev->vpci)
>>>
>>> leave_hypervisor_to_guest
>>>     vpci_process_pending: v->vpci.mem != NULL; v->vpci.pdev->vpci == NULL
>>>
>>> For the hardware domain we continue execution, as the worst that
>>> could happen is that MMIO mappings are left in place when the
>>> device has been deassigned.
>> Is continuing safe in this case? I.e. isn't there the risk of a NULL
>> deref?
> I think it is safe to continue

And why do you think so? I.e. why is there no race for Dom0 when there
is one for DomU?
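
Purely as an illustrative sketch (assuming vpci_process_pending() goes on
to dereference v->vpci.pdev->vpci; this is not a proposal), a guard along
the lines of

    if ( v->vpci.mem && !v->vpci.pdev->vpci )
    {
        /* The device's vPCI state is already gone - drop the pending work. */
        rangeset_destroy(v->vpci.mem);
        v->vpci.mem = NULL;
        return false;
    }

would avoid dereferencing the freed state, but it still wouldn't close the
underlying race.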

>>> For unprivileged domains that get a failure in the middle of a vPCI
>>> {un}map operation we need to destroy them, as we don't know in which
>>> state the p2m is. This can only happen in vpci_process_pending for
>>> DomUs as they won't be allowed to call pci_add_device.
>> You saying "we need to destroy them" made me look for a new domain_crash()
>> that you add, but there is none. What is this about?
> Yes, I guess we need to implicitly destroy the domain,

What do you mean by "implicitly"?
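
To make "destroy" concrete: an explicit variant (sketched for illustration
only, where rc stands for the hypothetical result of the failed {un}map
inside vpci_process_pending()) would be something like

    if ( rc && !is_hardware_domain(v->domain) )
        /* DomU p2m state is unknown after a failed {un}map - give up. */
        domain_crash(v->domain);

i.e. the crash would need to appear explicitly somewhere in the patch
rather than happen "implicitly".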

>>> @@ -165,6 +164,18 @@ bool vpci_process_pending(struct vcpu *v)
>>>       return false;
>>>   }
>>>   
>>> +void vpci_cancel_pending(const struct pci_dev *pdev)
>>> +{
>>> +    struct vcpu *v = current;
>>> +
>>> +    /* Cancel any pending work now. */
>> Doesn't "any" include pending work on all vCPU-s of the guest, not
>> just current? Is current even relevant (as in: part of the correct
>> domain), considering ...
>>
>>> --- a/xen/drivers/vpci/vpci.c
>>> +++ b/xen/drivers/vpci/vpci.c
>>> @@ -51,6 +51,8 @@ void vpci_remove_device(struct pci_dev *pdev)
>>>           xfree(r);
>>>       }
>>>       spin_unlock(&pdev->vpci->lock);
>>> +
>>> +    vpci_cancel_pending(pdev);
>> ... this code path, when coming here from pci_{add,remove}_device()?
>>
>> I can agree that there's a problem here, but I think you need to
>> properly (i.e. in a race free manner) drain pending work.
> Yes, the code is inconsistent in this respect. I am thinking about:
> 
> void vpci_cancel_pending(const struct pci_dev *pdev)
> {
>      struct domain *d = pdev->domain;
>      struct vcpu *v;
> 
>      /* Cancel any pending work now. */
>      domain_lock(d);
>      for_each_vcpu ( d, v )
>      {
>          vcpu_pause(v);
>          if ( v->vpci.mem && v->vpci.pdev == pdev)

Nit: Same style issue as in the original patch.

>          {
>              rangeset_destroy(v->vpci.mem);
>              v->vpci.mem = NULL;
>          }
>          vcpu_unpause(v);
>      }
>      domain_unlock(d);
> }
> 
> which seems to solve all the concerns. Is this what you mean?

Something along these lines. I expect you will want to make use of
domain_pause_except_self(), and I don't understand the purpose of
acquiring the domain lock.
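
A minimal sketch of that shape (untested, re-using the fields from your
draft above, and glossing over error handling that would still be needed)
might be:

    void vpci_cancel_pending(const struct pci_dev *pdev)
    {
        struct domain *d = pdev->domain;
        struct vcpu *v;

        /*
         * Pause all other vCPUs so none of them can be in the middle of
         * the pending work being cancelled.  Note that
         * domain_pause_except_self() may fail, which a real implementation
         * would need to handle rather than bail out silently.
         */
        if ( domain_pause_except_self(d) )
            return;

        for_each_vcpu ( d, v )
        {
            if ( v->vpci.mem && v->vpci.pdev == pdev )
            {
                rangeset_destroy(v->vpci.mem);
                v->vpci.mem = NULL;
            }
        }

        domain_unpause_except_self(d);
    }

With the other vCPUs paused there shouldn't be a need for the domain lock.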

Jan
