
Re: DMA restriction and NUMA node number


  • To: Julien Grall <julien@xxxxxxx>, Wei Chen <Wei.Chen@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 13 Jul 2021 12:21:09 +0200
  • Cc: Penny Zheng <Penny.Zheng@xxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, "xen-devel@xxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Tue, 13 Jul 2021 10:21:27 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 13.07.2021 11:26, Julien Grall wrote:
> On 13/07/2021 04:19, Wei Chen wrote:
>> I am doing some NUMA testing on Xen. And I find the DMA restriction is
>> based on NUMA node number [1].
>>      if ( !dma_bitsize && (num_online_nodes() > 1) )
>>          dma_bitsize = arch_get_dma_bitsize();
>>
>> On Arm64, we set dma_bitsize [2] to 0, which means we don't need to
>> reserve DMA memory. But when num_online_nodes > 1, dma_bitsize gets
>> overridden to 32. This may be caused by the Arm64 version of
>> arch_get_dma_bitsize, which may be a simple implementation that is not
>> NUMA aware.
>>
>> But I'm still quite curious about why the DMA restriction depends on the
>> NUMA node number.

So do you really mean "node count", not "node number"?

>> On Arm64, dma_bitsize does not change with the NUMA node count. So we
>> didn't expect arch_get_dma_bitsize to be called here.
>>
>> I copied Keir's commit message from 2008. It seems this code was considered
>> only for x86 when he was working on it. But I'm not an x86 expert, so I
>> hope the Xen x86 folks can give some help. Understanding this will help us to
> 
> It is best to CC the relevant people so they know you have requested
> their input. I have added the x86 maintainers to the thread.
> 
>> do some adaptations to Arm in subsequent modifications : )
>>
>> commit accacb43cb7f16e9d1d8c0e58ea72c9d0c32cec2
>> Author: Keir Fraser <keir.fraser@xxxxxxxxxx>
>> Date:   Mon Jul 28 16:40:30 2008 +0100
>>
>>      Simplify 'dma heap' logic.
>>
>>      1. Only useful for NUMA systems, so turn it off on non-NUMA systems by
>>         default.
>>      2. On NUMA systems, by default relate the DMA heap size to NUMA node 0
>>         memory size (so that not all of node 0's memory ends up being 'DMA
>>         heap').
>>      3. Remove the 'dma emergency pool'. It's less useful now that running
>>         out of low memory isn't as fatal as it used to be (e.g., when we
>>         needed to be able to allocate low-memory PAE page directories).

So on x86, memory starts at 0, and we want to be cautious about giving
out memory that may be needed for special purposes (first and foremost
DMA). With the buddy allocator working from high addresses down to lower
ones, low addresses will be used last (unless specifically requested)
without any further precautions, when not taking NUMA into account. This
in particular covers the case of just a single NUMA node.

When taking NUMA into account, we need to be more careful: if a single
node contains the majority (or all) of the more precious memory, we
want to prefer non-local allocations over exhausting those more precious
ranges. Hence we need to set aside some largely arbitrary amount,
allocation of which would happen only after all other nodes' memory has
also been exhausted.
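To put a rough number on that reserve: with the dma_bitsize of 32 that
the snippet quoted at the top falls back to, everything below 4GiB would
be the set-aside range. A minimal sketch, assuming 4KiB pages (the helper
name below is made up; only dma_bitsize itself comes from the quoted code):

    #include <stdbool.h>

    #define PAGE_SHIFT 12          /* assuming 4KiB pages */

    /*
     * Illustrative only -- not Xen code. With dma_bitsize == 32, frames
     * below 1UL << (32 - 12) == 0x100000 (i.e. memory below 4GiB) fall
     * into the reserve; dma_bitsize == 0 (the Arm64 default from [2])
     * means no reserve at all.
     */
    static bool mfn_in_low_reserve(unsigned long mfn, unsigned int dma_bitsize)
    {
        return dma_bitsize && mfn < (1UL << (dma_bitsize - PAGE_SHIFT));
    }

An allocator following the policy above would then satisfy requests from
all other nodes (and from non-reserved local memory) first, dipping into
this range only as a last resort or when a caller explicitly asks for
low memory.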

I hope I have suitably reconstructed the thinking back then. And yes,
there are x86 implications in here.

Jan




 

