
Re: [Xen-devel] [for-4.9] Re: HVM guest performance regression



>>> On 30.05.17 at 12:33, <jgross@xxxxxxxx> wrote:
> On 30/05/17 09:24, Jan Beulich wrote:
>>>>> On 29.05.17 at 21:05, <jgross@xxxxxxxx> wrote:
>>> Creating the domains with
>>>
>>> xl -vvv create ...
>>>
>>> showed the numbers of superpages and normal pages allocated for the
>>> domain.
>>>
>>> The following allocation pattern resulted in a slow domain:
>>>
>>> xc: detail: PHYSICAL MEMORY ALLOCATION:
>>> xc: detail:   4KB PAGES: 0x0000000000000600
>>> xc: detail:   2MB PAGES: 0x00000000000003f9
>>> xc: detail:   1GB PAGES: 0x0000000000000000
>>>
>>> And this one was fast:
>>>
>>> xc: detail: PHYSICAL MEMORY ALLOCATION:
>>> xc: detail:   4KB PAGES: 0x0000000000000400
>>> xc: detail:   2MB PAGES: 0x00000000000003fa
>>> xc: detail:   1GB PAGES: 0x0000000000000000
>>>
>>> I ballooned dom0 down in small steps to be able to create those
>>> test cases.
>>>
>>> I believe the main reason is that some data needed by the benchmark
>>> is located near the end of domain memory, resulting in a rather high
>>> TLB miss rate when not all (or nearly all) of the memory is available
>>> in the form of 2MB pages.
>> 
>> Did you double-check this by creating some other (persistent)
>> process prior to running your benchmark? I find it rather
>> unlikely that you would consistently see space from the top of
>> guest RAM allocated to your test, unless it consumes all the RAM
>> available at the time it runs (but then I'd consider it quite
>> likely that the overhead of using the few smaller pages would be
>> mostly hidden in the noise).
>> 
>> Or are you suspecting some crucial kernel structures to live
>> there?
> 
> Yes, I do. When onlining memory at boot time, the kernel uses the new
> memory chunk to hold the page structures and, if needed, new kernel
> page tables. It normally allocates that memory from the end of the new
> chunk.

The page tables are 4k allocations, sure. But the page structures
surely would be allocated with higher granularity?
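
For reference, the two allocation reports quoted above describe the
same total amount of guest memory; the only difference is roughly
2 MiB more of 4KB-mapped space in the slow case. A quick
back-of-the-envelope check (just a sketch, assuming the usual x86 page
sizes of 4KiB, 2MiB and 1GiB; the figures are taken from the xl logs
above):

/* Totals for the two "xl -vvv create" reports quoted above,
 * assuming 4KiB / 2MiB / 1GiB page sizes.
 */
#include <stdio.h>

static unsigned long long total_kib(unsigned long long p4k,
                                    unsigned long long p2m,
                                    unsigned long long p1g)
{
    return p4k * 4 + p2m * 2048 + p1g * 1048576;
}

int main(void)
{
    printf("slow: %llu MiB total, %llu MiB in 4KB pages\n",
           total_kib(0x600, 0x3f9, 0) / 1024,
           total_kib(0x600, 0, 0) / 1024);
    printf("fast: %llu MiB total, %llu MiB in 4KB pages\n",
           total_kib(0x400, 0x3fa, 0) / 1024,
           total_kib(0x400, 0, 0) / 1024);
    return 0;
}

Both come out at 2040 MiB, with 6 MiB respectively 4 MiB of it mapped
with 4KB pages, so if the 4KB-mapped tail really is the culprit, the
benchmark must be hitting those last few megabytes rather hard.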

>>>>> What makes the whole problem even more mysterious is that the
>>>>> regression was detected first with SLE12 SP3 (guest and dom0, Xen 4.9
>>>>> and Linux 4.4) against older systems (guest and dom0). While trying
>>>>> to find out whether the guest or the Xen version is the culprit, I
>>>>> found that the old guest (based on kernel 3.12) showed the mentioned
>>>>> performance drop with the above commit. The new guest (based on
>>>>> kernel 4.4) shows the same bad performance regardless of the Xen
>>>>> version or the amount of free memory. I haven't yet found the Linux
>>>>> kernel commit responsible for that performance drop.
>>>
>>> And this might be a result of the different memory usage of more
>>> recent kernels: I suspect the critical data is now at the very end of
>>> the domain's memory. As there are always some pages allocated in 4kB
>>> chunks, the last pages of the domain will never be part of a 2MB page.
>> 
>> But if the OS allocates large pages internally for relevant data
>> structures, those obviously won't come from that necessarily 4k-
>> mapped tail range.
> 
> Sure? I think the kernel uses 1GB pages, if possible, for the direct
> kernel mapping of the physical memory. It doesn't care whether the
> last such page maps some space which isn't populated.

Are you sure? I would very much hope for Linux to not establish
mappings to addresses where no memory (and no MMIO) resides.
But I can't tell for sure for recent Linux versions; I do know in the
old days they were quite careful there.
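
One way to check which page sizes the guest kernel's direct mapping
actually ends up using is the x86-specific DirectMap counters in
/proc/meminfo. A minimal reader (no more than what
"grep DirectMap /proc/meminfo" would show) could look like this:

/* Print the DirectMap* lines from /proc/meminfo.  On x86 these
 * counters report how much of the kernel's direct mapping is backed
 * by 4k, 2M and (where supported) 1G pages.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[128];

    if (!f) {
        perror("/proc/meminfo");
        return 1;
    }
    while (fgets(line, sizeof(line), f))
        if (!strncmp(line, "DirectMap", 9))
            fputs(line, stdout);
    fclose(f);
    return 0;
}

That doesn't show where the mapping ends, but it would at least tell
us whether the guest really covers the bulk of its memory with 1GB
(or only 2MB) pages.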

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

