[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] xen/pt: fix some pass-thru devices don't work across reboot



On Thu, Nov 15, 2018 at 11:40:39AM +0100, Roger Pau Monné wrote:
>On Thu, Nov 15, 2018 at 09:10:26AM +0800, Chao Gao wrote:
>> I find some pass-thru devices don't work any more across guest
>> reboot. Assigning it to another domain also meets the same issue. And
>> the only way to make it work again is un-binding and binding it to
>> pciback. Someone reported this issue one year ago [1].
>
>How does unbinding and binding to pciback fix the issue? Is this due
>to Dom0 using some PV or Dom0 only hypercall that can reset the
>host_maskall state?

I think it is msix_capability_init() that clear host_maskall. And this
function is called by PHYSDEVOP_prepare_msix sub-hypercall.

>
>> 
>> If the device's driver doesn't disable MSI-X during shutdown or qemu is
>> killed/crashed before the domain shutdown, this domain's pirq won't be
>> unmapped. Then xen will unmap all pirq. But pciback has already disabled
>> meory decoding before xen unmapping pirq. Then when Xen is disabling a
>> MSI of the device, it has to sets the host_maskall flag and maskall bit
>> to mask a MSI rather than sets maskbit in MSI-x table. The call trace of
>> this process is:
>> ->arch_domain_destroy
>>     ->free_domain_pirqs
>>         ->unmap_domain_pirq (if pirq isn't unmap by qemu)
>>             ->pirq_guest_force_unbind
>>                 ->__pirq_guest_unbind
>>                     ->mask_msi_irq(=desc->handler->disable())
>>                         ->the warning in msi_set_mask_bit()
>> 
>> The host_maskall bit will prevent guests from clearing the maskall bit
>> even the device is assigned to another guest later. Guests cannot
>> receive interrupts from this device.
>>
>> 
>> To fix this, host_maskall flag is cleared when all MSIs of a device are 
>> freed.
>> It is definitely safely to clear it because no msi is actually set up
>> for this device. Also, 'msix->warned' is initialized to DOMID_INVALID
>> rather than 0 to avoid warnings missing for Dom0.
>> 
>> [1]: 
>> https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html
>> 
>> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
>> ---
>>  xen/arch/x86/msi.c            | 18 ++++++++++++++++++
>>  xen/drivers/passthrough/pci.c |  1 +
>>  2 files changed, 19 insertions(+)
>> 
>> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
>> index 5567990..cd570bc 100644
>> --- a/xen/arch/x86/msi.c
>> +++ b/xen/arch/x86/msi.c
>> @@ -637,6 +637,7 @@ int msi_free_irq(struct msi_desc *entry)
>>  {
>>      unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
>>                        ? entry->msi.nvec : 1;
>> +    struct pci_dev *pdev = entry->dev;
>>  
>>      while ( nr-- )
>>      {
>> @@ -654,6 +655,23 @@ int msi_free_irq(struct msi_desc *entry)
>>  
>>      list_del(&entry->list);
>>      xfree(entry);
>> +
>> +    /*
>> +     * In some cases, the 'host_maskall' is set for safety. Here clear
>> +     * 'host_maskall' if all MSIs are freed for a msi-x capable device.
>> +     * Without it, the device's msix keeps disabled even been reassigned
>
>"... even after being reassigned ..."
>
>I think it's clearer.
>
>> +     * to another domain.
>> +     */
>> +    if ( pdev && list_empty(&pdev->msi_list) && pdev->msix )
>> +    {
>> +        if ( pdev->msix->host_maskall )
>> +            printk(XENLOG_G_WARNING
>> +                   "Resetting msix status of %04x:%02x:%02x.%u\n",
>> +                   pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
>> +                   PCI_FUNC(pdev->devfn));
>> +        pdev->msix->host_maskall = false;
>> +        pdev->msix->warned = DOMID_INVALID;
>> +    }
>>      return 0;
>
>IMO this looks quite similar to a msi_reset_state function or at least
>the start of it.
>
>Maybe it should be in a separate helper that sets all the fields to
>their initial values,

Will do.

>with the expectation that Dom0 will perform the
>device reset?

Dom0 resets devices when (de-)assignment by echo 1 to
/sys/bus/pci/devices/<sbdf>/reset.

>
>The underlying problem here AFAICT is that the Xen internal device
>state is not the same between device assignments.

Yes, it is accurate.

>
>In any case there should be at least a note here pointing out that Xen
>expects the hardware domain to perform a device reset, so the Xen
>internal state actually matches the device state before trying to
>assign the device to another guest.

Sounds good. This issue is that Xen tries to mask msi (when unmapping pirq)
after memory decoding is disabled by pciback. If pciback can unmap all the
pirq-s related a given device before disabling memory decoding, Xen won't meet
this issue. Currently, pciback doesn't maintain the pirq information; it
isn't able to do this.

Anyway, this fix patch can serve as a sanity check.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.