
Re: [Xen-devel] Broken PCI device passthrough, after XSA-302 fix?



On Mon, Jan 06, 2020 at 12:18:31PM +0100, Jan Beulich wrote:
> On 04.01.2020 02:07, Marek Marczykowski-Górecki  wrote:
> > I have a multi-function PCI device, behind a PCI bridge, that normally
> > I assign to a single domain. But now it fails with:
> > 
> > (XEN) [VT-D]d14: 0000:04:00.0 owned by d0!<G><0>assign 0000:05:00.0 to 
> > dom14 failed (-22)
> 
> Is this on the 1st attempt, or after the device had already been
> assigned to some (same or other) guest? After quite a bit of
> staring at the code I can't seem to be able to spot a difference
> in behavior for the 1st attempt, but you not saying explicitly
> that it would only happen on subsequent ones makes me assume you
> run into the issue right away.

Yes, it was the first try.

> > This is Xen 4.8.5 + XSA patches. It started happening after some update
> > during last few months, not really sure which one.
> 
> Having a smaller window would of course help, as would ...

The working version was from just before the XSAs of 2019-10-31 (which
include XSA-302).
But at this point, I'm not sure whether other configuration changes were
made at the same time (see below).

> > I guess it is because quarantine feature, so initial ownership of
> > 0000:05:00.0 is different than the bridge it is connected to.
> > I'm not sure if relevant for this case, but I also set
> > pcidev->rdm_policy = LIBXL_RDM_RESERVE_POLICY_RELAXED.
> > 
> > Booting with iommu=no-quarantine helps. Note I do not use `xl
> > pci-assignable-add` command, only bind the device to the pciback driver
> > in dom0.
> 
> ... knowing whether behavior differs when using this preparatory
> step.

xl pci-assignable-add doesn't make a difference with XSA-306 applied.
But I've tried xl pci-assignable-remove, with an interesting result:
it succeeded for 0000:05:00.0 and 0000:05:00.2, but failed for
0000:05:00.1 with this message:

(XEN) [VT-D]d0: 0000:05:00.1 owned by d32753!<G><0>deassign 0000:05:00.1
from dom32753 failed (-22)
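
For readers decoding the numbers in these messages, here is a minimal
sketch in Python. It assumes that dom32753 is Xen's special DOMID_IO
(0x7FF1), the domain that the quarantine feature assigns devices to, and
that the value in parentheses is a negated Linux-style errno. The helper
name describe() and the two regular expressions are hypothetical and are
only meant to match the two messages quoted in this thread.

```python
import errno
import re

# Special domain IDs as defined in Xen's public headers
# (assumption: the log prints them in decimal after "d").
SPECIAL_DOMIDS = {
    0x7FF0: "DOMID_SELF",
    0x7FF1: "DOMID_IO",
    0x7FF2: "DOMID_XEN",
}

# "<seg>:<bus>:<dev>.<fn> owned by d<domid>" part of the VT-d message.
OWNER_RE = re.compile(
    r"(?P<sbdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) owned by d(?P<domid>\d+)"
)
# Trailing "failed (-N)" part, where N is taken to be an errno value.
ERR_RE = re.compile(r"failed \((?P<err>-?\d+)\)")


def describe(line):
    """Return device, owner and symbolic error for one log line, or None."""
    owner = OWNER_RE.search(line)
    err = ERR_RE.search(line)
    if owner is None or err is None:
        return None
    domid = int(owner.group("domid"))
    code = abs(int(err.group("err")))
    return {
        "device": owner.group("sbdf"),
        "owner": SPECIAL_DOMIDS.get(domid, f"d{domid}"),
        "error": errno.errorcode.get(code, str(code)),
    }


# The wrapped message from above, joined back into a single line.
DEASSIGN_LINE = (
    "(XEN) [VT-D]d0: 0000:05:00.1 owned by d32753!<G><0>deassign "
    "0000:05:00.1 from dom32753 failed (-22)"
)

if __name__ == "__main__":
    print(describe(DEASSIGN_LINE))
```

Under those assumptions, the deassign message reads as: function
0000:05:00.1 was still owned by DOMID_IO rather than dom0, and the
operation was rejected with EINVAL.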

Anyway, I think my previous testing was inaccurate:
It looks like the issue is caused by me failing to set rdm_policy,
contrary to what I wrote in the quoted message above. I get the above
error only without LIBXL_RDM_RESERVE_POLICY_RELAXED set. When I set it
properly, the domain starts even without iommu=no-quarantine. I still
have some issues with the device within the domain, but I'm not sure
whether they are related to this or to something else.

Does it make sense now?
Is the patch from your other message still relevant?

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
