WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] Re: [PATCH] mem_sharing: fix race condition of nominate and

To: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH] mem_sharing: fix race condition of nominate and unshare
From: Jui-Hao Chiang <juihaochiang@xxxxxxxxx>
Date: Wed, 12 Jan 2011 18:03:57 +0800
Cc: tinnycloud <tinnycloud@xxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 12 Jan 2011 02:05:37 -0800
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=DuScOcRzuyQ6PL3Y5BycV4Sbw9ZcY9yEzQBNjE2+WgE=; b=ieZMTSRGFyIDLLHxEHSZoeoiUbm2wHUo0QSi9gJvRYekHUiRN5ybYe93SE4ciKOpxU ALifhOeOe1bJMSaVdzP53pcNHO3M3NtV7DtlEaerlYwcb7yRemscHGFz7B7bgtvdWZHZ bLXHLIcZVcHJbgG9efCVgz3YGJnraDTT9RLrc=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=tTR90FqJpeud6NwwUUj4bSQ3WP7YkB2fMTnYF9z4OG+i05Dent6Zc1hZ42PUhcho1U h7Kgw69EW5nGAp5863+r02qYuGthPkTsFx5uaPuLjciq4CRYpYqqV/hWtOc9Hz/ABT9z YfUg5xSDphwWQMsnG4TN+N6Smn42Faasi0qnU=
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTikBaB5awvRu3Sn3WfuoS3LRmeBH=pG8c7H1n4Cw@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTinMp1v1zex2BfcUuszotPuxJFWZQNUp40gu_gxL@xxxxxxxxxxxxxx> <20110106165450.GO21948@xxxxxxxxxxxxxxxxxxxxxxx> <AANLkTinmpiusLqegGZA+bZWpDXPM+7Wq2nt8MZa0Ocet@xxxxxxxxxxxxxx> <AANLkTinf-A_4NEPQeCw0pftM5Bks8BYPRhMx3-stTHxa@xxxxxxxxxxxxxx> <BLU157-ds1E01DEBE18840E5FBB6D1DA0E0@xxxxxxx> <AANLkTikBaB5awvRu3Sn3WfuoS3LRmeBH=pG8c7H1n4Cw@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi, Tim:

On Mon, Jan 10, 2011 at 4:10 PM, Jui-Hao Chiang <juihaochiang@xxxxxxxxx> wrote:

>>
>> After this change, unshare() has a potential problem of deadlock for
>> shr_lock and p2m_lock with different locking order.
>> Assume two CPUs do the following
>> CPU1: hvm_hap_nested_page_fault() => unshare() => p2m_change_type()
>> (locking order: shr_lock, p2m_lock)
>> CPU2: p2m_teardown() => unshare() (locking order: p2m_lock, shr_lock)
>> When CPU1 grabs shr_lock and CPU2 grabs p2m_lock, they deadlock later.
>>
>>  1.       mem_sharing_unshare_page() has the routine  called from
>> gfn_to_mfn_unshare, which is called by gnttab_transfer
>>
>> Since no bug report on grant_table right now, so I think this is safe for
>> now
>>
>> Also  p2m_tear_down è mem_sharing_unshare_page() , its flag is
>> MEM_SHARING_DESTROY_GFN, and won’t has the chance to
>>
>> call set_shared_p2m_entry()
>>
>>
>
> Of course, the p2m_teardown won't call set_shared_p2m_entry. But this does
> not change my argument that p2m_teardown() hold p2m_lock to wait on
> shr_lock. Actaully, after looking for a while, I rebut myself that the
> scenario of deadlock won't exist.
> When p2m_teardown is called, the domain is dying in its last few steps
> (device, irq are released), and there is no way for
> hvm_hap_nested_page_fault() to happen on the memory of the dying domain. If
> this case is eliminated, then my patch should not have deadlock problem. Any
> comments?
>

After a discussion with tinnycloud, his test is working after applying
the previous patch
http://lists.xensource.com/archives/html/xen-devel/2010-12/txteWc7Bs5Yap.txt
(set_shared_p2m_entry is not executed since it is in ASSERT).

And after a few code tracing and testing, my own worry about the
deadlock between p2m_lock and shr_lock actually disappears as the
above discussion. So here I re-attach the patch again which includes
another fix to recover type count when nominate fails on a page (from
our previous dicussions).

See if anything wrong.

Bests,
Jui-Hao

Attachment: mem_sharing_p2mt_race.patch
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
<Prev in Thread] Current Thread [Next in Thread>