
Re: [Xen-devel] NUMA_BALANCING and Xen PV guest regression in 3.20-rc0



Hi everyone,

On Thu, 2015-02-19 at 17:01 +0000, Mel Gorman wrote:
> On Thu, Feb 19, 2015 at 01:06:53PM +0000, David Vrabel wrote:

> I cannot think of a reason why this would fail for NUMA balancing on bare
> metal. The PAGE_NONE protection clears the present bit on p[te|md]_modify
> so the expectations are matched before or after the patch is applied. So,
> for bare metal at least
> 
> Acked-by: Mel Gorman <mgorman@xxxxxxx>
> 
> I *think* this will work ok with Xen but I cannot 100% convince myself.
> I'm adding Wei Liu to the cc who may have a Xen PV setup handy that
> supports NUMA and may be able to test the patch to confirm.
> 
I'm not Wei, but I've been able to test a kernel with David's patch in
the following conditions:

 1. as Dom0 kernel, when Xen does not have any virtual NUMA support
 2. as DomU PV kernel, when Xen does not have any virtual NUMA support
 3. as DomU PV kernel, when Xen _does_ _have_ virtual NUMA support
    (i.e., Wei's code)

Cases 1 and 2 have, I believe, already been tested by David, but
anyway... :-)

Case 3 also worked well for me, as the output below shows. With this
in the guest config file:

vnuma = [ [ "pnode=0","size=1000","vcpus=0-3","vdistances=10,20"  ],
          [ "pnode=1","size=1000","vcpus=4-7","vdistances=20,10"  ],
        ]

This is what I get from inside the guest:

    root@test-pv:~# numactl --hardware
    available: 2 nodes (0-1)
    node 0 cpus: 0 1 2 3
    node 0 size: 951 MB
    node 0 free: 868 MB
    node 1 cpus: 4 5 6 7
    node 1 size: 968 MB
    node 1 free: 924 MB
    node distances:
    node   0   1 
      0:  10  20 
      1:  20  10

And this is what the host sees:

    root@Zhaman:~# xl debug-keys u ; xl dmesg |tail -12
    (XEN) Memory location of each domain:
    (XEN) Domain 0 (total: 1047417):
    (XEN)     Node 0: 1031009
    (XEN)     Node 1: 16408
    (XEN) Domain 1 (total: 512000):
    (XEN)     Node 0: 256000
    (XEN)     Node 1: 256000
    (XEN)      2 vnodes, 8 vcpus, guest physical layout:
    (XEN)          0: pnode   0, vcpus 0-3 
    (XEN)            0000000000000000 - 000000003e800000
    (XEN)          1: pnode   1, vcpus 4-7
    (XEN)            000000003e800000 - 000000007d000000
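
As a quick sanity check, the guest physical boundaries above line up
exactly with the "size=1000" entries in the vnuma config (which suggests
the size is interpreted in MiB). Trivial arithmetic, with the addresses
copied from the dmesg output:

```python
# Verify the vnode boundaries from 'xl dmesg' against size=1000 (MiB)
MiB = 1024 * 1024
vnode0_end = 0x3e800000   # end of vnode 0, from the layout above
vnode1_end = 0x7d000000   # end of vnode 1

assert vnode0_end == 1000 * MiB               # vnode 0: exactly 1000 MiB
assert vnode1_end - vnode0_end == 1000 * MiB  # vnode 1: same size
print(vnode0_end // MiB, (vnode1_end - vnode0_end) // MiB)  # 1000 1000
```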


Still inside the guest, I see this:

    root@test-pv:~# cat /proc/sys/kernel/numa_balancing
    1

And this:

    root@test-pv:~# grep numa /proc/vmstat 
    numa_hit 65987
    numa_miss 0
    numa_foreign 0
    numa_interleave 14473
    numa_local 58642
    numa_other 7345
    numa_pte_updates 596
    numa_huge_pte_updates 0
    numa_hint_faults 479
    numa_hint_faults_local 420
    numa_pages_migrated 51
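
For what it's worth, the hint-fault locality implied by those counters
is easy to compute. A minimal sketch in plain Python, parsing a
vmstat-style dump with the numbers from the output above:

```python
# Parse numa_* counters from a /proc/vmstat-style dump and report
# what fraction of NUMA hinting faults were satisfied locally.
sample = """\
numa_pte_updates 596
numa_huge_pte_updates 0
numa_hint_faults 479
numa_hint_faults_local 420
numa_pages_migrated 51
"""

counters = {}
for line in sample.splitlines():
    name, value = line.split()
    counters[name] = int(value)

local_ratio = counters["numa_hint_faults_local"] / counters["numa_hint_faults"]
print(f"{local_ratio:.0%} of hinting faults were local")  # 420/479 -> 88%
```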

So, yes, I would say this works with Xen. Is that correct, Mel?

I'll try running more complex stuff like 'perf bench numa' inside the
guest and see what happens...

Regards,
Dario


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
