Re: [Xen-devel] [PATCH v6 00/10] vnuma introduction
On ven, 2014-07-18 at 12:48 +0100, Wei Liu wrote:
> On Fri, Jul 18, 2014 at 12:13:36PM +0200, Dario Faggioli wrote:
> > On ven, 2014-07-18 at 10:53 +0100, Wei Liu wrote:
> > > I've also encountered this. I suspect that even if you disable SMT with
> > > cpuid in the config file, the cpu topology in the guest might still be wrong.
> > >
> > Can I ask why?
> >
>
> Because for a PV guest (currently) the guest kernel sees the real "ID"s
> for a cpu. See those "ID"s I change in my hacky patch.
>
Right, now I see/remember it. Well, this is, I think, something we
should try to fix _independently_ of vNUMA, shouldn't we?
I mean, even right now, PV guests see completely random cache-sharing
topology, and that does (at least potentially) affect performance, as
the guest scheduler will make incorrect/inconsistent assumptions.
I'm not sure what the correct fix is. Probably something similar to what
you're doing in your hack... but, indeed, I think we should do something
about this!
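Just for reference, the cpuid-based SMT hiding Wei mentions above would look
something like this in the domain config file. This is only a sketch (I'm
assuming the libxl cpuid syntax and the "htt" flag name), and, as Wei says,
it may well not be enough for a PV guest, which derives its topology from the
raw IDs anyway:

  # start from the host CPUID policy and clear the HTT bit, so the guest
  # is told there are no hyperthreads (sketch only, not what I tested below)
  vcpus = 4
  cpuid = [ "host,htt=0" ]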
> > > What do hwloc-ls and lscpu show? Do you see any weird topology like one
> > > core belongs to one node while three belong to another?
> > >
> > Yep, that would be interesting to see.
> >
> > > (I suspect not
> > > because your vcpus are already pinned to a specific node)
> > >
> > Sorry, I'm not sure I follow here... Are you saying that things probably
> > work ok, but that is (only) because of pinning?
>
> Yes. Given that you derive NUMA memory allocation from cpu pinning, or
> use a combination of cpu pinning, vcpu-to-vnode map and vnode-to-pnode
> map, in those cases those IDs might reflect the right topology.
>
Well, pinning does not (and, arguably, should not) always happen just because
a virtual topology is being used.
So, again, I don't think we should rely on pinning to get a sane and, more
importantly, consistent SMT and cache-sharing topology.
Linux maintainers, any ideas?
BTW, I tried a few examples on the following host:
root@benny:~# xl info -n
...
nr_cpus : 8
max_cpu_id : 15
nr_nodes : 1
cores_per_socket : 4
threads_per_core : 2
cpu_mhz : 3591
...
cpu_topology :
cpu: core socket node
0: 0 0 0
1: 0 0 0
2: 1 0 0
3: 1 0 0
4: 2 0 0
5: 2 0 0
6: 3 0 0
7: 3 0 0
numa_info :
node: memsize memfree distances
0: 34062 31029 10
Here are the guest configurations I tried, in terms of vCPU pinning (a sketch
of the corresponding config snippets follows, after the four outputs):
1) pairs of vCPUs ==> same pCPU
root@benny:~# xl vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
debian.guest.osstest 9 0 0 -b- 2.7 0
debian.guest.osstest 9 1 0 -b- 5.2 0
debian.guest.osstest 9 2 7 -b- 2.4 7
debian.guest.osstest 9 3 7 -b- 4.4 7
2) no SMT
root@benny:~# xl vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
debian.guest.osstest 11 0 0 -b- 0.6 0
debian.guest.osstest 11 1 2 -b- 0.4 2
debian.guest.osstest 11 2 4 -b- 1.5 4
debian.guest.osstest 11 3 6 -b- 0.5 6
3) Random
root@benny:~# xl vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
debian.guest.osstest 12 0 3 -b- 1.6 all
debian.guest.osstest 12 1 1 -b- 1.4 all
debian.guest.osstest 12 2 5 -b- 2.4 all
debian.guest.osstest 12 3 7 -b- 1.5 all
4) yes SMT
root@benny:~# xl vcpu-list
Name ID VCPU CPU State Time(s) CPU Affinity
debian.guest.osstest 14 0 1 -b- 1.0 1
debian.guest.osstest 14 1 2 -b- 1.8 2
debian.guest.osstest 14 2 6 -b- 1.1 6
debian.guest.osstest 14 3 7 -b- 0.8 7
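(For reference, and as mentioned above, here is roughly how the pinning in the
four cases maps to the guest config file. Just a sketch: I'm assuming the
per-vCPU list form of "cpus" is available in the xl being used, and I'm leaving
everything else out of the configs.)

  # 1) pairs of vCPUs pinned to the same pCPU
  vcpus = 4
  cpus  = [ "0", "0", "7", "7" ]

  # 2) no SMT: one vCPU per core (0, 2, 4 and 6 are all different cores)
  cpus  = [ "0", "2", "4", "6" ]

  # 3) random: no "cpus" at all, so affinity stays "all"

  # 4) yes SMT: vCPUs 2 and 3 end up on two SMT siblings (pCPUs 6 and 7)
  cpus  = [ "1", "2", "6", "7" ]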
And, in *all* these 4 cases, here's what I see:
root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/core_siblings_list
0-3
0-3
0-3
0-3
root@debian:~# cat /sys/devices/system/cpu/cpu*/topology/thread_siblings_list
0-3
0-3
0-3
0-3
root@debian:~# lstopo
Machine (488MB) + Socket L#0 + L3 L#0 (8192KB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
  PU L#0 (P#0)
  PU L#1 (P#1)
  PU L#2 (P#2)
  PU L#3 (P#3)
root@debian:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 4
Core(s) per socket: 1
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 60
Stepping: 3
CPU MHz: 3591.780
BogoMIPS: 7183.56
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
I.e., no matter how I pin the vcpus, the guest sees the 4 vcpus as if
they were all SMT siblings, within the same core, sharing all cache
levels.
This is not the case for dom0 (which I booted with dom0_max_vcpus=4 on the
Xen command line), where I see this:
root@benny:~# lstopo
Machine (422MB)
  Socket L#0 + L3 L#0 (8192KB)
    L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
      PU L#0 (P#0)
      PU L#1 (P#1)
    L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
      PU L#2 (P#2)
      PU L#3 (P#3)
root@benny:~# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 60
Stepping: 3
CPU MHz: 3591.780
BogoMIPS: 7183.56
Hypervisor vendor: Xen
Virtualization type: none
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
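(For completeness, the dom0 vCPU limit mentioned above comes from the
hypervisor command line; with GRUB 2 that is something along the lines of the
fragment below. Only dom0_max_vcpus=4 is what actually matters here, the file
location and the update step depend on the distro.)

  # /etc/default/grub (then update-grub and reboot) -- sketch only
  GRUB_CMDLINE_XEN="dom0_max_vcpus=4"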
What am I doing wrong, or what am I missing?
Thanks and Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)