Re: [Xen-devel] Re: Re: mapping problems in xenpaging

To:	Tim Deegan <tim@xxxxxxx>, Andres Lagar Cavilla <andres@xxxxxxxxxxxxxxxx>
Subject:	Re: [Xen-devel] Re: Re: mapping problems in xenpaging
From:	Keir Fraser <keir.xen@xxxxxxxxx>
Date:	Mon, 10 Oct 2011 11:06:02 +0100
Cc:	Olaf Hering <olaf@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, zhen shi <bickys1986@xxxxxxxxx>, Adin Scannell <adin@xxxxxxxxxxxxxx>
Delivery-date:	Mon, 10 Oct 2011 03:07:58 -0700
Dkim-signature:	v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=user-agent:date:subject:from:to:cc:message-id:thread-topic :thread-index:in-reply-to:mime-version:content-type :content-transfer-encoding; bh=t4pTkn9GVWOaVyFQRUR0AeF4XcqJmh3QzY7jLq0qsu8=; b=J6p4FsPyeDn0xWro+iTn6yWuwySOX/BxweiTARlMDNSJ1qiKj2UvRHy5F5ZhcNpDVA Ry79AOOqKWLd3RcL4kkN2NgCPD82QhD51Vkz/bwKbU8R5DJnKMMBOLPFfn38yBUTT4Zb qiCBipFDqqp7qpe0kLOITxspMiJRkdD0h59c0=
Envelope-to:	www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to:	<20111010092111.GB31800@xxxxxxxxxxxxxxxxxxxxx>
List-help:	<mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id:	Xen developer discussion <xen-devel.lists.xensource.com>
List-post:	<mailto:xen-devel@lists.xensource.com>
List-subscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender:	xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index:	AcyHNDcj60+WNXNrK0uZ9zGSoYCFSw==
Thread-topic:	[Xen-devel] Re: Re: mapping problems in xenpaging
User-agent:	Microsoft-Entourage/12.30.0.110427

On 10/10/2011 10:21, "Tim Deegan" <tim@xxxxxxx> wrote:

> At 21:20 -0400 on 09 Oct (1318195224), Andres Lagar Cavilla wrote:
>> I have a proposal. I'd like to hear from the list what they think.
>> 
>> - 1. change p2m lock to a read/write lock
>> - 2. Make lookups (gfn_to_mfn_* family) take a read lock. All current
>> callers of p2m_lock will become write lockers.
>> - 3. Change the gfn_to_mfn_* family to get_page on the mfn obtained,
>> while holding the read lock.
>> - 4. Have all lookup callers put_page on the obtained mfn, once done.
> 
> This seems like a step in the right direction, but if we're going to
> make this big an interface change there might be better interfaces to
> end up with.
> 
> A few issues I can see with it:
>  - p2m lookups are on some very performance-sensitive paths
>    (e.g. multiple times in any pagetable walk or instruction emulation
>    in a HVM guest) so adding the rwlock might have a noticeable impact.

If the read sections are short, may as well use a plain spinlock.

The best (but hard) way to make the locking cheaper is to work out a way to
use finer-grained locks (e.g., per-page / per-mapping) or avoid locks
altogether (e.g., RCU).

Multi-reader locks are rarely going to be a good choice in the hypervisor.

A good first step anyhow would be to make the p2m_ synchronisation correct,
and then optimise it. Sounds like that is hard enough. :-)

 -- Keir

>  - This fixes one class of races (page gets freed-to-xen underfoot) but
>    leaves another one (gfn -> mfn map changes underfoot) untouched.  In
>    particular it doesn't solve the race where a foreign mapper
>    gets a r/w map of what's meant to be a read-only frame.
> 
> I think that to fix things properly we need to have some refcount
> associated with the p2m mapping itself.  That would be taken by all
> lookups (or at least some - we could have a flag to the p2m lookup) and
> released as you suggest, but more importantly it would block all p2m changes
> while the count was raised.  (I think that a least in the common case we
> could encode such a refcount using the existing typecount).
> 
> One problem then is how to make all the callers of the p2m update
> functions handle failure, either by waiting (new deadlock risk?) or
> returning EAGAIN at the hypercall interface.  Paths where the update
> isn't caused by an explicit request (like log-dirty and the mem-event
> rx-to-rw conversion) would be particularly tricky.
> 
> More seriously, it introduces another round of the sort of priority
> inversion we already get with the existing refcounts - a foreign
> mapping, caused by a user-space program in another VM, could arbitrarily
> delay a p2m update (and so prevent a VCPU from making progress), without
> any mechanism to even request that the mapping be removed.
> 
> Any ideas how to avoid that?  Potentially with some extra bookkeeping on
> foreign mappings we could revoke or redirect them when the p2m changes.
> That would fit nicely with the abstraction in the interfaces where HVM
> domains' memory is always indexed by pfn.  I can imagine it being quite
> tricky though.
> 
>> I'm more wary that turning p2m locking into read/write will result in
>> code deadlocking itself: taking a read lock first and a write lock
>> later. Possibly the current rwlock implementation could be improved to
>> keep a cpumask of read-lockers, and provide an atomic "promote from
>> read to write" atomic operation (something along the lines of wait
>> until you're the only reader in the cpumask, and then cmpxchg(lock,
>> -1, WRITE_BIAS))
> 
> I think that would deadlock if two cpus tried it at once.
> 
> Tim.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

WARNING - OLD ARCHIVES

xen-devel

Re: [Xen-devel] Re: Re: mapping problems in xenpaging