Re: [PATCH v2 12/18] AMD/IOMMU: allow use of superpage mappings


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 13 Dec 2021 11:00:23 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Mon, 13 Dec 2021 10:00:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 13.12.2021 10:45, Roger Pau Monné wrote:
> On Mon, Dec 13, 2021 at 09:49:50AM +0100, Jan Beulich wrote:
>> On 10.12.2021 16:06, Roger Pau Monné wrote:
>>> On Fri, Sep 24, 2021 at 11:52:14AM +0200, Jan Beulich wrote:
>>>> ---
>>>> I'm not fully sure about allowing 512G mappings: The scheduling-for-
>>>> freeing of intermediate page tables can take quite a while when
>>>> replacing a tree of 4k mappings by a single 512G one. Plus (or otoh)
>>>> there's no present code path via which 512G chunks of memory could be
>>>> allocated (and hence mapped) anyway.
>>>
>>> I would limit this to 1G, which is also what we support for CPU
>>> page tables.
>>
>> I'm not sure I buy the comparison with CPU-side support when we're
>> not sharing page tables. Not least with PV in mind.
> 
> Hm, my thinking was that the same reasons that prevent us from
> doing 512G mappings on the CPU side would also apply to the IOMMU.
> Regardless of that, given the current way in which replaced page
> table entries are freed, I'm not sure it's fine to allow 512G
> mappings, as freeing the possibly huge number of 4K entries could
> allow guests to hog a CPU for a long time.

This huge amount can occur only when replacing a hierarchy with
sufficiently many 4k leaves by a single 512G page. Yet afaics there's
no way such an operation can be initiated right now. That's, as said
in the remark, because there's no way to allocate a 512G chunk of
memory in one go. When re-coalescing, the worst that can happen is
one L1 table's worth of 4k mappings, one L2 table's worth of 2M
mappings, and one L3 table's worth of 1G mappings. All other mappings
would already need to have been superpages at checking time. Hence
the total upper bound (for the enclosing map / unmap) is again
primarily determined by there not being any way to establish 512G
mappings.
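
To put rough numbers on that (just illustrative arithmetic, not
anything resembling the actual code; the 512-entry table size, i.e.
9 address bits per level, is the usual assumption for these page
tables):

#include <stdio.h>

#define PTES_PER_TABLE 512UL /* assumed 9 address bits per level */

int main(void)
{
    /* Replacing a fully populated tree of 4k leaves by one 512G
     * entry frees 1 L3 + 512 L2 + 512 * 512 L1 tables. */
    unsigned long full_tree = 1 + PTES_PER_TABLE +
                              PTES_PER_TABLE * PTES_PER_TABLE;

    /* Re-coalescing: at most one partially populated table per
     * level (L1, L2, L3) can exist at checking time. */
    unsigned long recoalesce = 3;

    printf("full 512G replace: %lu tables freed\n", full_tree); /* 262657 */
    printf("re-coalescing:     %lu tables freed\n", recoalesce);

    return 0;
}

That is, without a 512G map path the freeing work per operation stays
bounded by a handful of tables, rather than a quarter million.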

Actually, thinking about it, there is one path via which 512G
mappings could be established, but it's reachable by Dom0 only
(XEN_DOMCTL_memory_mapping) and would require a PCI device with
gigantic BARs. Even if such a device existed, I think we're fine to
assume that Dom0 won't establish such mappings to replace existing
ones, but will only ever put them in place when nothing was mapped
in that range yet.
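
(If that assumption was ever to be turned into an actual check, a
sketch could look like the below; hypothetical code, not anything
present in the tree, with the helper name and the present-bit test
made up for illustration.)

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: refuse to install a superpage unless the covered
 * range is entirely unmapped, so no subtree of page tables would
 * ever need scheduling for freeing. */
static bool range_unmapped(const uint64_t *table, unsigned int first,
                           unsigned int nr)
{
    unsigned int i;

    for ( i = first; i < first + nr; ++i )
        if ( table[i] & 1 ) /* assumed present bit */
            return false;

    return true;
}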

> It would be better if we could somehow account this in a per-vCPU way,
> kind of similar to what we do with vPCI BAR mappings.

But recording them per-vCPU wouldn't make any difference to the
number of pages that could accumulate in a single run. Maybe I'm
missing something in what you're thinking about here ...

Jan