This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
Home Products Support Community News


[Xen-devel] Re: [PATCH RFC] vmalloc: eagerly clear ptes on vunmap

To: Nick Piggin <npiggin@xxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH RFC] vmalloc: eagerly clear ptes on vunmap
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Tue, 30 Nov 2010 09:45:47 -0800
Cc: "Xen-devel@xxxxxxxxxxxxxxxxxxx" <Xen-devel@xxxxxxxxxxxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>, Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx>, Bryan Schumaker <bjschuma@xxxxxxxxxx>, Linux Memory Management List <linux-mm@xxxxxxxxx>, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 30 Nov 2010 09:46:37 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20101130124249.GB15778@amd>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4CEF6B8B.8080206@xxxxxxxx> <20101127103656.GA6884@amd> <4CF40DCB.5010007@xxxxxxxx> <20101130124249.GB15778@amd>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20101103 Fedora/1.0-0.33.b2pre.fc14 Lightning/1.0b3pre Thunderbird/3.1.6
On 11/30/2010 04:42 AM, Nick Piggin wrote:
> On Mon, Nov 29, 2010 at 12:32:11PM -0800, Jeremy Fitzhardinge wrote:
>> When unmapping a region in the vmalloc space, clear the ptes immediately.
>> There's no point in deferring this because there's no amortization
>> benefit.
>> The TLBs are left dirty, and they are flushed lazily to amortize the
>> cost of the IPIs.
>> This specific motivation for this patch is a regression since 2.6.36 when
>> using NFS under Xen, triggered by the NFS client's use of vm_map_ram()
>> introduced in 56e4ebf877b6043c289bda32a5a7385b80c17dee.  XFS also uses
>> vm_map_ram() and could cause similar problems.
> I do wonder whether there are cache benefits from batching page table
> updates, especially the batched per cpu maps

Perhaps.  But perhaps there are cache benefits in clearing early because
the ptes are still in cache from when they were set?
>  (and in your version they
> get double-cleared as well).

I thought I'd avoided that.  Oh, right, in both vb_free(), and again -
eventually - in free_vmap_block->free_unmap_vmap_area_noflush.

Delta patch below.

>   I think this patch is good, but I think
> perhaps making it configurable would be nice.

I'd rather not unless there's a strong reason to do so.

It occurs to me that once you remove the lazily mapped ptes, then all
that code is doing is keeping track of ranges of addresses with dirty
tlb entries.  But on x86 at least, any kernel tlb flush is a global one,
so keeping track of fine-grain address information is overkill.  I
wonder if the overall code can be simplified as a result?

On a more concrete level, vmap_page_range_noflush() and
vunmap_page_range() could be implemented with apply_to_page_range()
which removes a chunk of boilerplate code (however, it would result in a
callback per pte rather than one per pte page - but I'll fix that now).

> So... main question, does it allow Xen to use lazy flushing and avoid
> vm_unmap_aliases() calls?

Yes, it seems to.


Subject: [PATCH] vmalloc: avoid double-unmapping percpu blocks

The area has always been unmapped by the time free_vmap_block() is
called, so there's no need to unmap it again.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9551316..ade3302 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -520,13 +520,12 @@ static void purge_vmap_area_lazy(void)
- * Free and unmap a vmap area, caller ensuring flush_cache_vunmap had been
- * called for the correct range previously.
+ * Free a vmap area, caller ensuring that the area has been unmapped
+ * and flush_cache_vunmap had been called for the correct range
+ * previously.
-static void free_unmap_vmap_area_noflush(struct vmap_area *va)
+static void free_vmap_area_noflush(struct vmap_area *va)
-       unmap_vmap_area(va);
        va->flags |= VM_LAZY_FREE;
        atomic_add((va->va_end - va->va_start) >> PAGE_SHIFT, &vmap_lazy_nr);
        if (unlikely(atomic_read(&vmap_lazy_nr) > lazy_max_pages()))
@@ -534,6 +533,16 @@ static void free_unmap_vmap_area_noflush(struct vmap_area 
+ * Free and unmap a vmap area, caller ensuring flush_cache_vunmap had been
+ * called for the correct range previously.
+ */
+static void free_unmap_vmap_area_noflush(struct vmap_area *va)
+       unmap_vmap_area(va);
+       free_vmap_area_noflush(va);
  * Free and unmap a vmap area
 static void free_unmap_vmap_area(struct vmap_area *va)
@@ -709,7 +718,7 @@ static void free_vmap_block(struct vmap_block *vb)
        BUG_ON(tmp != vb);
-       free_unmap_vmap_area_noflush(vb->va);
+       free_vmap_area_noflush(vb->va);
        call_rcu(&vb->rcu_head, rcu_free_vb);

Xen-devel mailing list