WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

RE: [Xen-devel] Re: Should shadow_lock be spin_lock_recursive?

To: "Ling, Xiaofeng" <xiaofeng.ling@xxxxxxxxx>, "Sharma, Arun" <arun.sharma@xxxxxxxxx>
Subject: RE: [Xen-devel] Re: Should shadow_lock be spin_lock_recursive?
From: "Michael A Fetterman" <Michael.Fetterman@xxxxxxxxxxxx>
Date: Thu, 12 May 2005 13:59:29 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4282B804.5010403@xxxxxxxxx>
I think there's another bug somewhere that's provoking this.
There may be a flaw in my argument, below, but I currently think this
argument is correct:

Consider the call tree:

free_dom_mem() is trying to get rid of all shadow references to page X,
so that it can relinquish page X back to the free list.
*Note that free_dom_mem() has done a get_page() on X, so X's refcount
must be >= 1...

free_dom_mem() calls shadow_sync_and_drop_references(X),
which calls shadow_remove_all_access(X),
which calls remove_all_access_in_page(random shadow page, X),
which (when it finds references to X) calls put_page(X).

However, those calls to put_page(X) should never result in calls
to free_domheap_pages(), as X's refcount should always be >= 1
because of the get_page performed in free_dom_mem().
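
The refcount argument above can be modelled in a few lines of C. This is only a sketch: struct page, the freed flag, and shadow_drops_free_page() are hypothetical stand-ins, and get_page()/put_page() here are simplified models of the Xen functions named above, not their real implementations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a Xen page; only the refcount is modelled. */
struct page {
    int  refcount;
    bool freed;     /* set when the last reference is dropped */
};

/* Simplified model: take one reference. */
static void get_page(struct page *pg)
{
    pg->refcount++;
}

/* Simplified model: drop one reference, free the page at zero. */
static void put_page(struct page *pg)
{
    assert(pg->refcount > 0);
    if (--pg->refcount == 0)
        pg->freed = true;   /* stands in for free_domheap_pages() */
}

/*
 * Models the free_dom_mem() path: pin the page, drop the shadow
 * references, and report whether any of those drops freed the page.
 */
static bool shadow_drops_free_page(struct page *pg, int shadow_refs)
{
    get_page(pg);                       /* the caller's own reference */
    for (int i = 0; i < shadow_refs; i++)
        put_page(pg);                   /* remove_all_access_in_page() */

    bool freed_early = pg->freed;
    if (!freed_early)
        put_page(pg);                   /* the caller drops its pin last */
    return freed_early;
}
```

With a consistent count (for example refcount 3 and 3 shadow references) the function returns false, because the caller's pin keeps the count at 1 during the loop. It returns true only when the page starts with fewer counted references than shadow references, which is the "already broken" case.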

So that tells me the refcount on X was already broken before we got
here...

Michael 

-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Xiaofeng Ling
Sent: Thursday, May 12, 2005 2:57 AM
To: Sharma, Arun
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] Re: Should shadow_lock be spin_lock_recursive?

This deadlock happened in the VNIF code with shadow mode enabled.
The shadow_lock path is so complex that using a recursive lock may be
the easiest way to avoid the deadlock for now, until the path is
cleaned up or the lock is split.
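
For reference, a recursive lock of the kind suggested above can be sketched with C11 atomics. This is an illustration under stated assumptions, not Xen's spin_lock_recursive: struct rec_lock, its functions, and the integer CPU ids are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_OWNER (-1)

/* Hypothetical recursive spinlock: owner id plus nesting depth. */
struct rec_lock {
    atomic_int owner;   /* id of the holding CPU, or NO_OWNER */
    int        depth;   /* nesting depth, touched only by the owner */
};

static void rec_lock_init(struct rec_lock *l)
{
    atomic_init(&l->owner, NO_OWNER);
    l->depth = 0;
}

static void rec_lock_acquire(struct rec_lock *l, int cpu)
{
    /* Only the owner can observe its own id here, so this test is safe. */
    if (atomic_load(&l->owner) != cpu) {
        int expected = NO_OWNER;
        /* Spin until the lock is free and we claim it. */
        while (!atomic_compare_exchange_weak(&l->owner, &expected, cpu))
            expected = NO_OWNER;    /* a failed CAS overwrote it; reset */
    }
    l->depth++;
}

static void rec_lock_release(struct rec_lock *l, int cpu)
{
    assert(atomic_load(&l->owner) == cpu && l->depth > 0);
    if (--l->depth == 0)
        atomic_store(&l->owner, NO_OWNER);  /* outermost release */
}

/* Models a nested acquisition on one CPU; returns the inner depth. */
static int nested_depth_seen(void)
{
    struct rec_lock shadow;
    int inner_depth;

    rec_lock_init(&shadow);
    rec_lock_acquire(&shadow, 0);   /* outer caller takes the lock */
    rec_lock_acquire(&shadow, 0);   /* nested caller takes it again */
    inner_depth = shadow.depth;
    rec_lock_release(&shadow, 0);
    rec_lock_release(&shadow, 0);
    return inner_depth;
}
```

The second acquire on the same CPU only bumps the depth instead of spinning, so the nested path completes. The lock is handed to other CPUs only when the outermost release brings the depth back to zero.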

Arun Sharma wrote:
> 
> During our testing, we found this code path where Xen attempts to grab
> the shadow_lock while already holding it, leading to a deadlock.
> 
>  >> free_dom_mem->
>  >> shadow_sync_and_drop_references->
>  >> shadow_lock -> ..................... first lock
>  >> shadow_remove_all_access->
>  >> remove_all_access_in_page->
>  >> put_page->
>  >> free_domheap_pages->
>  >> shadow_drop_references->
>  >> shadow_lock -> ..................... second lock
> 
> Questions:
> 
> - should shadow lock be recursive?
> - is shadow lock too coarse grained? It seems to have led to a lot of 
> code refactoring (__foo without lock and foo with lock). But there may 
> be more such instances we haven't found yet.
> 
>     -Arun
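
The "__foo without lock and foo with lock" refactoring mentioned in the questions above can be sketched as follows. This is a single-threaded model with hypothetical names: the lock is a plain flag that asserts on re-acquisition rather than spinning, and the unlocked variant is spelled with a _nolock suffix because identifiers beginning with a double underscore are reserved in ordinary C programs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a non-recursive lock: a flag that traps re-entry. */
static bool shadow_locked;
static int  pages_dropped;

static void shadow_lock(void)
{
    assert(!shadow_locked);     /* a real spinlock would deadlock here */
    shadow_locked = true;
}

static void shadow_unlock(void)
{
    assert(shadow_locked);
    shadow_locked = false;
}

/* Unlocked variant ("__foo"): the caller must already hold the lock. */
static void drop_references_nolock(void)
{
    assert(shadow_locked);
    pages_dropped++;
}

/* Locked variant ("foo"): for callers that do not hold the lock. */
static void drop_references(void)
{
    shadow_lock();
    drop_references_nolock();
    shadow_unlock();
}

/* A caller that already holds the lock must use the unlocked variant. */
static void sync_and_drop_references(void)
{
    shadow_lock();
    drop_references_nolock();   /* drop_references() would re-take the lock */
    shadow_unlock();
}
```

The cost of this pattern is that every path reaching a locked function has to be audited for whether the lock is already held, which is the maintenance burden the question about lock granularity points at.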


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

