
Re: PCI pass-through vs PoD


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 17 Nov 2021 11:13:27 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Anthony Perard <anthony.perard@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Wed, 17 Nov 2021 10:14:11 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 17.11.2021 09:55, Roger Pau Monné wrote:
> On Wed, Nov 17, 2021 at 09:39:17AM +0100, Jan Beulich wrote:
>> On 13.09.2021 11:02, Jan Beulich wrote:
>>> libxl__domain_config_setdefault() checks whether PoD is going to be
>>> enabled and fails domain creation if at the same time devices would get
>>> assigned. Nevertheless setting up of IOMMU page tables is allowed.
> 
> I'm unsure whether allowing enabling the IOMMU with PoD is the right
> thing to do, at least for our toolstack.

May I ask why you are unsure?

>>> However, when later assigning a device to a domain which has IOMMU page
>>> tables, libxl__device_pci_add() does not appear to be concerned of PoD:
>>> - xc_test_assign_device() / XEN_DOMCTL_test_assign_device only check the
>>>   device for being available to assign,
>>> - libxl__device_pci_setdefault() is only concerned about the RDM policy,
>>> - other functions called look to not be related to such checking at all.
>>
>> I've now verified this to be the case. In fact creating the guest and
>> assigning it a device while the guest still sits in the boot loader
>> allowed the (oldish) Linux guest I've been using to recognize the device
>> (and hence load its driver) even without any hotplug driver. Obviously
>> while still in the boot loader ...
>>
>>> IMO assignment should fail if pod.count != pod.entry_count,
>>
>> ... this holds, and hence assignment should have failed.
>>
>> IOW this approach currently is a simple "workaround" to avoid the "PCI
>> device assignment for HVM guest failed due to PoD enabled" error upon
>> domain creation.
>>
>> I'll see if I can find a reasonable place to add the missing check; I'm
>> less certain about ...
>>
>>> and all PoD
>>> entries should be resolved otherwise (whether explicitly by the
>>> hypervisor or through some suitable existing hypercalls - didn't check
>>> yet whether there are any reasonable candidates - by the tool stack is
>>> secondary).
>>
>> ... the approach to take here.
> 
> I think forcing all entries to be resolved would be unexpected when
> assigning a device.
> 
> I would rather print a message saying that either the guest must
> balloon down to the requested amount of memory, or that all entries
> should be resolved (ie: using mem-set to match the mem-max value).

But ballooning down alone doesn't help. That only ensures there is
enough memory in the PoD cache to fill all PoD entries. The PoD
entries will get replaced (resolved) only as they get touched. That
touching is what I call "resolving them", and what would be needed
for assignment to be safe (for the guest). Expecting the guest to
do anything about this is imo not very reasonable; it can only be
the tool stack or the hypervisor effecting this.
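
To illustrate the distinction between ballooning down and resolving,
here is a minimal sketch of a toy model. It is not Xen code: the struct
and all function names are hypothetical, and only the field names
count / entry_count are borrowed from the discussion above. It assumes
that touching a PoD entry moves one page from the PoD cache into the
guest, and that assignment is safe only once no PoD entries remain.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; not Xen's real PoD bookkeeping. */
struct pod_state {
    long count;        /* pages currently held in the PoD cache */
    long entry_count;  /* PoD entries not yet backed by a real page */
};

/*
 * Guest balloons down to its target: surplus PoD entries are handed
 * back until the cache can cover every remaining entry. Nothing gets
 * populated by this step.
 */
static void balloon_down(struct pod_state *pod)
{
    if (pod->entry_count > pod->count)
        pod->entry_count = pod->count;
}

/* Touch one PoD entry: a page moves from the cache into the guest. */
static void resolve_one(struct pod_state *pod)
{
    assert(pod->count > 0 && pod->entry_count > 0);
    pod->count--;
    pod->entry_count--;
}

/* Touch every remaining entry; requires the cache to cover them all. */
static void resolve_all(struct pod_state *pod)
{
    while (pod->entry_count > 0)
        resolve_one(pod);
}

/* Assumed safety condition: no outstanding PoD entries at all. */
static bool safe_to_assign(const struct pod_state *pod)
{
    return pod->entry_count == 0;
}
```

In this model a guest with count = 512 and entry_count = 1024 is still
not safe to assign to after balloon_down() (both values are then 512);
only after resolve_all() does safe_to_assign() return true.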

Jan




 

