Xen project Mailing List

Re: [Xen-devel] [PATCH] x86/domain_page: implement pure per-vCPU mapping infrastructure

From: "Xia, Hongyan" <hongyxia@xxxxxxxxxx>

Date: Tue, 4 Feb 2020 16:02:38 +0000

Accept-language: en-GB, en-US

Cc: "andrew.cooper3@xxxxxxxxxx" <andrew.cooper3@xxxxxxxxxx>, "Grall, Julien" <jgrall@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "jbeulich@xxxxxxxx" <jbeulich@xxxxxxxx>, "roger.pau@xxxxxxxxxx" <roger.pau@xxxxxxxxxx>

Delivery-date: Tue, 04 Feb 2020 16:02:44 +0000

Ironport-sdr: 5LziEAaTU5rPvHmiWu3yDEgxN/jZFnMaFMudqlW6X8i1cc/b4wwvTRHr+pFl3s/EJd9W/apShq 58q9iNyEFHFg==

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHV21PsYByQ0ZyJ+UmPk89DPZdobagLMmMA

Thread-topic: [PATCH] x86/domain_page: implement pure per-vCPU mapping infrastructure

On Tue, 2020-02-04 at 12:08 +0000, Wei Liu wrote: > Thanks, I welcome effort to make Xen more scalable. :-) > > On Mon, Feb 03, 2020 at 06:36:53PM +0000, Hongyan Xia wrote: > > Rewrite the mapcache to be purely per-vCPU instead of partly per- > > vCPU > > and partly per-domain. > > > > This patch is needed to address performance issues when we start > > relying > > on the mapcache, e.g., when we do not have a direct map. Currently, > > the > > per-domain lock on the mapcache is a bottleneck for multicore, > > causing > > performance degradation and even functional regressions. This patch > > makes the mapping structure per-vCPU and completely lockless. > > > > When I see "regression", I think of something that was working before > but not anymore. I don't think that's the case for the following two > things. > > I would just classify them as bug and/or improvement. We probably have different definitions for "regression"... but I can reword. > > Functional regression: > > > > When a domain is run on more than 64 cores, FreeBSD 10 panicks > > frequently > > due to occasional simultaneous set_singleshot_timer hypercalls from > > too > > many cores. Some cores will be blocked waiting on map_domain_page, > > eventually failing to set a timer in the future. FreeBSD cannot > > handle > > this and panicks. This was fixed in later FreeBSD releases by > > handling > > -ETIME, but still the degradation in timer performance is a big > > issue. > > > > Performance regression: > > > > Many benchmarks see a performance drop when having a large core > > count. > > I have done a Geekbench on a 32-vCPU guest. > > > > perf drop old new > > single 0.04% 0.18% > > multi 2.43% 0.08% > > > > Removing the per-domain lock in the mapcache brings the multi-core > > performance almost identical to using the direct map for mappings. > > > > There should be room for futher optimisations, but this already > > improves over the old mapcache by a lot. > > > > Note that entries in the maphash will occupy inuse slots. With 16 > > slots > > per vCPU and a maphash capacity of 8, we only have another 8 > > available, > > which is not enough for nested page table walks. We need to > > increase the > > number of slots in config.h. > > > > Signed-off-by: Hongyan Xia <hongyxia@xxxxxxxxxx> > > > As far as I can tell all vcpus still share the same per-domain > mapcache > region. The difference is now the region is divided into subregion > for > each vcpu to use. You are right. We have a per-domain VA range and we have create_perdomain_mappings which can be reused nicely. The patch divides the regions into vCPUs. > > > --- > > xen/arch/x86/domain.c | 5 +- > > xen/arch/x86/domain_page.c | 229 +++++++++++------------------ > > ------ > > xen/include/asm-x86/config.h | 2 +- > > xen/include/asm-x86/domain.h | 30 +---- > > 4 files changed, 80 insertions(+), 186 deletions(-) > > > > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > > index f53ae5ff86..a278aa4678 100644 > > --- a/xen/arch/x86/domain.c > > +++ b/xen/arch/x86/domain.c > > @@ -445,6 +445,9 @@ void arch_vcpu_destroy(struct vcpu *v) > > xfree(v->arch.msrs); > > v->arch.msrs = NULL; > > > > + xfree(v->arch.pv.mapcache); > > + v->arch.pv.mapcache = NULL; > > + > > Keep in mind that accessing the union {pv, hvm} before knowing the > exact variant is dangerous. > > Because mapcache is initialised for PV only, so it should be freed > for > PV only. I think you need to put it to pv_vcpu_destroy. > > If you don't want to do that because of asymmetry with > mapcache_vcpu_init, you may want to invent mapcache_vcpu_destroy for > your purpose. Ah right this is a problem. I was working on a tree where everyone has a mapcache, which is not true for current upstream when I cherry- picked. Will fix. Hongyan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.