This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] [PATCH] turn off writable page tables

To: Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] turn off writable page tables
From: Andrew Theurer <habanero@xxxxxxxxxx>
Date: Thu, 27 Jul 2006 09:43:56 -0500
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Gerd Hoffmann <kraxel@xxxxxxx>
Delivery-date: Thu, 27 Jul 2006 07:44:23 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <A95E2296287EAD4EB592B5DEEFCE0E9D572247@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <A95E2296287EAD4EB592B5DEEFCE0E9D572247@xxxxxxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird (Windows/20060516)

fork scaled quite linearly from a small number to a large number of dirty
pages. Below are the min and max:

         1280 pages    128000 pages
wtpt:     813 usec      37552 usec
emulate: 3279 usec     283879 usec

Good, at least that suggests that the code works for the usage it was
intended for.
So, in a -perfect-world- this works great.  The problem is that most
workloads don't appear to have a vast percentage of entries that need to
be updated.  I'll go ahead and expand this test to find out what the
threshold is to break even.  I'll also see if we can implement a
call in fork to update the parent - I hope this will show just as good
performance even when most entries need modification, and even better
performance over wtpt with a low number of entries modified.

With license to make more invasive changes to core Linux mm it certainly
should be possible to optimize this specific case with a batched update
fairly easily. You could even go further and implement a 'make all PTEs
in a pagetable RO' hypercall, possibly including a copy to the child. This
could potentially work better than the current 'late pin'; at least the
validation would be incremental rather than in one big hit at the end.

FWIW, I found the threshold for emulate vs wtpt. I ran the fork test with
a set number of pages dirtied, such that we had x PTEs modified per pte_page.

#pte usec
002 5242
004 5251
006 5373
008 5519
010 5873

#pte usec
002 4922
004 5265
006 6074
008 6991
010 7806
012 5988

So, the threshold appears to be around 4 PTEs/page. I was a little shocked at first how low this number is, but considering the near identical performance with the various workloads, this makes sense. All of the workloads had the vast majority of writable pages flushed with just 2 PTEs/page changed and a handful with more PTEs/page changed. It would not surprise me if the overall average was around 4 PTEs/page.

I am having a hard time finding any "enterprise" workloads which have a lot of PTEs/page right before fork. If anyone can point me to some, that would be great.

I will look into batching next, but I am curious whether simply using a hypercall instead of write fault + emulate will make any difference at all. I'll try that first, then implement the batched update. Eventually a hypercall which does more would be nice, but I guess we'll have to convince the Linux maintainers it's a good idea.
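
For the batched variant, a kernel-side sketch might look like the following.
This is not runnable standalone code: the helper name, the MAX_BATCH constant,
and the exact field usage are assumptions layered on the PV mmu_update
interface of that era, not the patch under discussion.

```c
/* Hypothetical helper: queue one MMU_NORMAL_PT_UPDATE per PTE,
 * clearing _PAGE_RW, and apply them all with a single hypercall
 * instead of taking a write fault + emulate per entry. */
static int make_ptes_readonly(pte_t *ptep, unsigned long n)
{
    struct mmu_update req[MAX_BATCH];   /* MAX_BATCH: assumed constant */
    unsigned long i;

    for (i = 0; i < n; i++) {
        req[i].ptr = virt_to_machine(ptep + i) | MMU_NORMAL_PT_UPDATE;
        req[i].val = pte_val(ptep[i]) & ~_PAGE_RW;
    }
    /* One hypercall validates and applies all n updates. */
    return HYPERVISOR_mmu_update(req, n, NULL, DOMID_SELF);
}
```

Whether this beats a single plain hypercall per PTE is exactly the question
raised above; the batching only pays off once the per-call overhead dominates
the per-entry validation cost.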


Xen-devel mailing list