[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 10/21] AMD/IOMMU: allow use of superpage mappings


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 5 May 2022 16:34:54 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XKvnMLOx6F5zJw3pKTDY0Uhbi7RbOqyNost6ZW7B7JA=; b=Ce/Z6aMnf1DB9aMlD/rOtmJpaHmkEFVv/qLtUcDr+Hmz0pGHQ76MvSdJi1N0FbNuRFgkHsErcZjhfxyXzZJqlrtN9jp1zxWZfU/h03fe13pD/NibtweZE3UP3cv2Lfpmn3im6/SsMQc4ZOD5QjtE2HvGGEJDZCWosFd4bAbFEZ60Y+okmk7Nhu0LS8U+YUPwVxhNQEVeFu7yQVsm21y35Fa2RrNjbIySP0FOlOCl6Hh24usib5dXRA3/9psKQHiQ5wmwMrlZb27Y0AqRrIUFBJJkBj9UolXTa4AIWVw1gYztvtDks6Uc4DpwcilfAsAfV/fFrEV8+x+2eGr+9pVb/A==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=HvNc9cpanvT/a/YmmuzcD8OdAJIa5Nfaf58Lg/X7a7Nnuo7RwJikUKfJDzb76ntclR+xdR8pSD4K0xBWrRrHlTTYP58O3D2etL/Kd1JLQV37Boo/Xp0eJEZPEb0es2yS2I6arT0skzCvxu/KDTZgbqelzTigZtc5RWuaTaSA+3hfdNwmngF4QfIpB1Vm1Rwe3OTxgLLUFTGnRW4A7kR/l5vixGgA/oLEZywoyKzi5f4fg1G0pHw6QxJAL+hMWkW/YhIWbh0YrtwyvI1ex6Y18wn2XPqBH+fasruTrvpNwRMVzNScRhtE2uqSi+m0VT7V/CZclIZov9cxAaH5cdh5IA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Thu, 05 May 2022 14:35:07 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 05.05.2022 15:19, Roger Pau Monné wrote:
> On Mon, Apr 25, 2022 at 10:38:06AM +0200, Jan Beulich wrote:
>> No separate feature flags exist which would control availability of
>> these; the only restriction is HATS (establishing the maximum number of
>> page table levels in general), and even that has a lower bound of 4.
>> Thus we can unconditionally announce 2M, 1G, and 512G mappings. (Via
>> non-default page sizes the implementation in principle permits arbitrary
>> size mappings, but these require multiple identical leaf PTEs to be
>> written, which isn't all that different from having to write multiple
>> consecutive PTEs with increasing frame numbers. IMO that's therefore
>> beneficial only on hardware where suitable TLBs exist; I'm unaware of
>> such hardware.)
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Thanks.

>> ---
>> I'm not fully sure about allowing 512G mappings: The scheduling-for-
>> freeing of intermediate page tables would take quite a while when
>> replacing a tree of 4k mappings by a single 512G one. Yet then again
>> there's no present code path via which 512G chunks of memory could be
>> allocated (and hence mapped) anyway, so this would only benefit huge
>> systems where 512 1G mappings could be re-coalesced (once suitable code
>> is in place) into a single L4 entry. And re-coalescing wouldn't result
>> in scheduling-for-freeing of full trees of lower level pagetables.
> 
> I would think part of this should go into the commit message, as to
> why enabling 512G superpages is fine.

Together with what you say at the bottom I wonder whether, rather than
moving this into the description in a slightly edited form, I shouldn't
drop the PAGE_SIZE_512G there. I don't think that would invalidate your
R-b.

>> @@ -384,7 +406,7 @@ int cf_check amd_iommu_map_page(
>>          return rc;
>>      }
>>  
> 
> I think it might be helpful to assert or otherwise check that the
> input order is supported by the IOMMU, just to be on the safe side.

Well, yes, I can certainly do so. Given how the code was developed it
didn't seem very likely that such a fundamental assumption could be
violated, but I guess I see your point.

Jan

>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> @@ -747,7 +747,7 @@ static void cf_check amd_dump_page_table
>>  }
>>  
>>  static const struct iommu_ops __initconst_cf_clobber _iommu_ops = {
>> -    .page_sizes = PAGE_SIZE_4K,
>> +    .page_sizes = PAGE_SIZE_4K | PAGE_SIZE_2M | PAGE_SIZE_1G | 
>> PAGE_SIZE_512G,
> 
> As mentioned on a previous email, I'm worried if we ever get to
> replace an entry populated with 4K pages with a 512G superpage, as the
> freeing cost of the associated pagetables would be quite high.
> 
> I guess we will have to implement a more preemptive freeing behavior
> if issues arise.
> 
> Thanks, Roger.
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.