WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

[Xen-devel] mem_sharing: summarized problems when domain is dying

To: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Subject: [Xen-devel] mem_sharing: summarized problems when domain is dying
From: Jui-Hao Chiang <juihaochiang@xxxxxxxxx>
Date: Fri, 21 Jan 2011 11:19:03 -0500
Cc: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 21 Jan 2011 08:19:44 -0800
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=dQZm9Jqbd6KedP+PnuLsKEkNcdJKkhsf/JEK+8yiJ4w=; b=GcUi4Na+7FEDSV/4qs2fWlQZP4VlNCzD+78cyHDwP5pyuC3BQebXDJR02ytTjRz/Q0 0MZ4nK1V2VcD++8AANr5ojlHhnO3ger8E0wjhyUOBjy/TMbyi2ZinZWzFgiyrsZtSeOQ WTHrZr2Lj9hTM7BWMeThsh+d08iVscnX8WKt0=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=fK+soW0LN9vzHX7N+UWpegLtzOI4gshev+bAU7j2YBkn8qXJrw/irPGxtjDGWwLKKS MZVBT2GlqN+ecSR/UqpkFLPa9+RjKCPRF0yr5zQ4sJC24Z6ejayBrIcsQ9AgiV2bzKZi 28qSZ86T8wJzpO1vPY7phuWdbg5mZJAz7kKkQ=
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi, Tim:

>From tinnycloud's result, here I summarize the current problem and
findings of mem_sharing due to domain dying.
(1) When domain is dying, alloc_domheap_page() and
set_shared_p2m_entry() would just fail. So the shr_lock is not enough
to ensure that the domain won't die in the middle of mem_sharing code.
As tinnycloud's code shows, is that better to use
rcu_lock_domain_by_id before calling the above two functions?

(2) What's the proper behavior of nominate/share/unshare when domain is dying?
The following is just my current guess. Please give comments as well.

(2.1) nominate: return fail; but needs to check blktap2's code to make
sure it understand and act properly (should be minor issue now)

(2.2) share: return success but skip the gfns of dying domain, i.e.,
we don't remove them from the hash list, and don't update their p2m
entry (set_shared_p2m_entry). We believe that the p2m_teardown will
clean up them later.

(2.3) unshare: it's the most problematic part. Because we are not able
to alloc_domheap_page at this moment, the only thing we can do is
simply skip the page and return. But what's the side effect?
(a) If p2m_teardown comes in, there is no problem. Just destroy it and done.
(b) hap_nested_page_fault: if we return fail, will this cause problem
to guest? or we can simply return success to cheat the guest. But
later the guest will trigger another page fault if it write the page
again.
(c) gnttab_map_grant_ref: this function specify must_succeed to
gfn_to_mfn_unshare(), which would BUG if unshare() fails.

Do we really need (b) and (c) in the last steps of domain dying? If
that's the case, we need to have a special alloc_domheap_page for
dying domain.


On Thu, Jan 20, 2011 at 4:19 AM, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
> At 07:19 +0000 on 20 Jan (1295507976), MaoXiaoyun wrote:
>> Hi:
>>
>>             The latest BUG in mem_sharing_alloc_page from 
>> mem_sharing_unshare_page.
>>             I printed heap info, which shows plenty memory left.
>>             Could domain be NULL during in unshare, or should it be locked 
>> by rcu_lock_domain_by_id ?
>>
>
> 'd' probably isn't NULL; more likely is that the domain is not allowed
> to have any more memory.  You should look at the values of d->max_pages
> and d->tot_pages when the failure happens.
>
> Cheers.
>
> Tim.
>

Bests,
Jui-Hao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel