
Re: [PATCH v4 02/11] vpci: cancel pending map/unmap on vpci removal



On 18.11.2021 16:21, Oleksandr Andrushchenko wrote:
> On 18.11.21 17:16, Jan Beulich wrote:
>> On 18.11.2021 16:11, Oleksandr Andrushchenko wrote:
>>> On 18.11.21 16:35, Jan Beulich wrote:
>>>> On 18.11.2021 15:14, Oleksandr Andrushchenko wrote:
>>>>> On 18.11.21 16:04, Roger Pau Monné wrote:
>>>>>> Indeed. In the physdevop failure case this comes from a hypercall
>>>>>> context, so maybe you could do the mapping in place using hypercall
>>>>>> continuations if required. Not sure how complex that would be,
>>>>>> compared to just deferring to the guest entry point and then dealing
>>>>>> with the possible cleanup on failure.
>>>>> This will solve one part of the equation:
>>>>>
>>>>> pci_physdev_op
>>>>>           pci_add_device
>>>>>               init_bars -> modify_bars -> defer_map -> raise_softirq(SCHEDULE_SOFTIRQ)
>>>>>           iommu_add_device <- FAILS
>>>>>           vpci_remove_device -> xfree(pdev->vpci)
>>>>>
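As the patch subject suggests, the fix for this first part is to cancel the
deferred map before vpci_remove_device() frees pdev->vpci. A minimal sketch
of that ordering only; it assumes a vpci_cancel_pending() helper like the one
proposed further down in this thread and elides everything else
vpci_remove_device() has to do:

#include <xen/pci.h>
#include <xen/vpci.h>
#include <xen/xmalloc.h>

void vpci_remove_device(struct pci_dev *pdev)
{
    /*
     * Cancel any map/unmap deferred from modify_bars() for this device
     * first, so no vCPU is left with pending work referencing ...
     */
    vpci_cancel_pending(pdev);

    /* ... the vPCI state which is freed right below. */
    xfree(pdev->vpci);
    pdev->vpci = NULL;
}
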
>>>>> But what about the other case, e.g. vpci_process_pending being triggered
>>>>> in parallel with a PCI device de-assign?
>>>> Well, that's again in hypercall context, so using hypercall continuations
>>>> may again be an option. Of course at the point a de-assign is initiated,
>>>> you "only" need to drain requests (for that device, but that's unlikely
>>>> to be worthwhile optimizing for), while ensuring no new requests can be
>>>> issued. Again, for the device in question, but here this is relevant -
>>>> a flag may want setting to refuse all further requests. Or maybe the
>>>> register handling hooks may want tearing down before draining pending
>>>> BAR mapping requests; without the hooks in place no new such requests
>>>> can possibly appear.
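
To make the draining variant more concrete, a rough sketch along those lines
is below. This is not the actual patch: vpci_remove_handlers() is a made-up
name standing in for the register hook teardown mentioned above,
map_pending/pdev are the per-vCPU fields from this series, and it relies on
the caller turning -ERESTART into a hypercall continuation as suggested
earlier in the thread.

#include <xen/pci.h>
#include <xen/sched.h>
#include <xen/vpci.h>

/*
 * True if some vCPU of the owning domain still has a deferred
 * map/unmap queued for this device.
 */
static bool vpci_pending_for_pdev(const struct pci_dev *pdev)
{
    const struct vcpu *v;

    for_each_vcpu ( pdev->domain, v )
        if ( v->vpci.map_pending && v->vpci.pdev == pdev )
            return true;

    return false;
}

/* Called from the de-assign / removal (hypercall) path. */
static int vpci_drain_pending(struct pci_dev *pdev)
{
    /* Tear down the register hooks first: no new requests can appear. */
    vpci_remove_handlers(pdev);            /* made-up helper name */

    /* Drain what is already queued by retrying via a continuation. */
    return vpci_pending_for_pdev(pdev) ? -ERESTART : 0;
}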
>>> This can probably be solved even more easily, as we were talking about
>>> pausing all vCPUs:
>> I have to admit I'm not sure. It might be easier, but it may also be
>> less desirable.
>>
>>> void vpci_cancel_pending(const struct pci_dev *pdev)
>>> {
>>>       struct domain *d = pdev->domain;
>>>       struct vcpu *v;
>>>       int rc;
>>>
>>>       while ( (rc = domain_pause_except_self(d)) == -ERESTART )
>>>           cpu_relax();
>>>
>>>       if ( rc )
>>>           printk(XENLOG_G_ERR
>>>                  "Failed to pause vCPUs while canceling vPCI map/unmap for %pp %pd: %d\n",
>>>                  &pdev->sbdf, pdev->domain, rc);
>>>
>>>       for_each_vcpu ( d, v )
>>>       {
>>>           if ( v->vpci.map_pending && (v->vpci.pdev == pdev) )
>>>
>>> This will prevent all vCPUs but the current one from running, making it
>>> impossible for vpci_process_pending to run in parallel with any hypercall.
>>> So, even without locking in vpci_process_pending, the above should
>>> be enough.
>>> The only concern here is that domain_pause_except_self may return an
>>> error code which we somehow need to handle...
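
The loop body is cut off above; presumably it would continue roughly as
below. A sketch only: it assumes the deferred ranges still live in the
upstream v->vpci.mem rangeset in this series, uses rangeset_destroy() from
xen/rangeset.h, and finishes with the unpause counterpart of the call above.

    for_each_vcpu ( d, v )
    {
        if ( v->vpci.map_pending && (v->vpci.pdev == pdev) )
        {
            /* Drop the deferred range set and clear the pending state. */
            rangeset_destroy(v->vpci.mem);
            v->vpci.mem = NULL;
            v->vpci.map_pending = false;
        }
    }

    domain_unpause_except_self(d);
}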
>> Not just this. The -ERESTART handling isn't appropriate this way
>> either.
> Are you talking about cpu_relax()?

I'm talking about that spin-waiting loop as a whole.

>>   For the moment I can't help thinking that draining would
>> be preferable over canceling.
> Given that cancellation is only going to happen on an error path or
> on device de-assign/remove, I think this can be acceptable.
> Any reason why not?

It would seem to me that the correctness of a draining approach is
going to be easier to prove than that of a canceling one, where I
expect races to be a bigger risk. Especially for something that gets
executed infrequently, if ever (error paths in particular), knowing
that things work correctly from testing alone isn't typically possible.

Jan
