   
 
 
To: "xen-devel" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Xen Crashes when releasing gnttab mappings - of a crashed domain.
From: "Moffie, Micha" <micha.moffie@xxxxxx>
Date: Tue, 21 Nov 2006 14:09:35 -0000
Delivery-date: Tue, 21 Nov 2006 06:09:56 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <C188B3C8.4D26%keir@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AccNc31su++dBHlmEduJYgAX8io7RQAAhe0A
Thread-topic: Xen Crashes when releasing gnttab mappings - of a crashed domain.
Observation:
------------
When two miniOs domains are connected (using a shared ring), Xen itself
(not a domain) crashes when the miniOs domains exit.

Xen crashes and produces the following: 
(XEN) Xen call trace:
(XEN)    [<ff11d20d>] __bug+0x29/0x45
(XEN)    [<ff107cb3>] gnttab_release_mappings+0xcb/0x2e5
(XEN)    [<ff1046dd>] domain_kill+0x29/0x62
(XEN)    [<ff10349a>] do_domctl+0x6d6/0xfbc
(XEN)    [<ff165755>] hypercall+0x95/0xb5
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 1:
(XEN) BUG at grant_table.c:1122
(XEN) ****************************************


The cause:
----------
Xen tries to release the grant table mappings by accessing a remote
domain grant table. 
But the remote domain seems to be non-existent and consequently Xen
fails:
find_domain_by_id (in gnttab_release_mappings) returns NULL.


Analysis:
---------
The situation described above should never happen: if I understand
correctly, a domain should not be completely destroyed until there are
no more references to it.
See: put_domain(d) // sched.h
Which is defined as follows:
if ( atomic_dec_and_test(&(_d)->refcnt) ) domain_destroy(_d)

It does, however, happen when a domain crashes.
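
To make the reference-counting expectation concrete, here is a minimal,
single-threaded sketch of the get/put pattern. It is not Xen code: the
toy_domain type and the toy_* functions are hypothetical stand-ins, and
"destroyed" is just a flag rather than a real teardown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct domain; only what the sketch needs. */
struct toy_domain {
    atomic_int refcnt;   /* number of outstanding references */
    bool destroyed;      /* set once the last reference is dropped */
};

void toy_domain_init(struct toy_domain *d)
{
    atomic_init(&d->refcnt, 1);  /* the creator holds the first reference */
    d->destroyed = false;
}

/* Take a reference; refuses if the domain is already gone.
 * (Single-threaded sketch: the check and the increment are not one
 * atomic step here.) */
bool toy_get_domain(struct toy_domain *d)
{
    if (d->destroyed)
        return false;
    atomic_fetch_add(&d->refcnt, 1);
    return true;
}

/* Drop a reference; the domain is destroyed only when the count hits 0. */
void toy_put_domain(struct toy_domain *d)
{
    int old = atomic_fetch_sub(&d->refcnt, 1);
    assert(old > 0);             /* dropping more than was taken is a bug */
    if (old == 1)
        d->destroyed = true;
}
```

With this pattern, a holder of a mapping into a remote domain would keep
a reference, so the remote domain could not disappear underneath it.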

Note that there are two ways to "finish" with a domain (domain.c):
1.      domain_kill (which calls domain_destroy) - releases all
        resources in a graceful manner.
2.      __domain_crash (which calls domain_shutdown) - seems to kill
        the domain without properly releasing the resources that
        reference it (this function is called in extreme cases).


Our scenario:
-------------

We are running two miniOs with the same profile:
Open a ring (share a page with a grant ref and map a page from a remote
domain)
Write
Read
Close the ring (dealloc, unmap*)
do_exit()



Timeline ->
MiniOs 1:  ..........         calls do_exit() ->
                                     .. domain_kill() ->
                                            .. gnttab_release_mappings() ->
                                                    .. BUG()

MiniOs 2:    crashes**

                 
* When we unmap, we use Xen's hypercall to unmap a grant reference
together with the gnttab_unmap_grant_ref structure.
Note that we have a bug: we do NOT set unmap_op.dev_bus_addr to 0 as we
should.
Xen's API (in public/grant_table.h) explicitly states that it should
be 0, or the grant reference will be treated as a valid device mapping.
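
A minimal sketch of how the unmap request could be prepared so that
dev_bus_addr is never left uninitialized. To keep it compilable without
the Xen headers, the structure below is a local mirror of
gnttab_unmap_grant_ref (the layout is an assumption and should be
checked against public/grant_table.h in your tree), and
prepare_unmap_op is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the unmap request (assumed layout, see note above). */
struct unmap_op_sketch {
    uint64_t host_addr;     /* IN: host virtual address of the mapping */
    uint64_t dev_bus_addr;  /* IN: must be 0 when no device mapping exists */
    uint32_t handle;        /* IN: handle returned by the map operation */
    int16_t  status;        /* OUT: filled in by the hypervisor */
};

/* Hypothetical helper: zero the whole request first, then fill in only
 * the fields that really apply, so dev_bus_addr cannot hold stack junk. */
void prepare_unmap_op(struct unmap_op_sketch *op,
                      uint64_t host_addr, uint32_t handle)
{
    assert(op != NULL);
    memset(op, 0, sizeof(*op));
    op->host_addr = host_addr;
    op->handle = handle;
}
```

Zeroing the whole structure, rather than assigning field by field, also
covers any field added to the request later.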

** Because of the bug described in *, we cause the domain to crash.
We observe:
(XEN) grant_table.c:394: Bad frame number doesn't match gntref
(XEN) mm.c:760: Attempt to implicitly unmap a granted PTE 
(XEN) domain_crash called from mm.c:761 



Summary:
-----------

1. Setting unmap_op.dev_bus_addr to 0 removes the BUG and all is well.
2. But crashing Xen - even with our error - doesn't seem to be a healthy
choice.
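
To illustrate point 2, here is a toy model of a release loop that
tolerates a missing remote domain instead of treating it as fatal. This
is not a proposed patch and does not reflect Xen's data structures: all
toy_* names and types are hypothetical, and whether skipping is actually
safe inside Xen is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins. */
struct toy_dom     { int id; int alive; };
struct toy_mapping { int remote_domid; int in_use; };

/* Lookup in the spirit of find_domain_by_id: NULL if the domain is gone. */
struct toy_dom *toy_find_domain(struct toy_dom *table, size_t ndoms, int id)
{
    for (size_t i = 0; i < ndoms; i++)
        if (table[i].alive && table[i].id == id)
            return &table[i];
    return NULL;
}

/* Release every in-use mapping. A mapping whose remote domain no longer
 * exists is counted and dropped locally rather than triggering a panic.
 * Returns the number of such orphaned mappings. */
size_t toy_release_mappings(struct toy_mapping *maps, size_t nmaps,
                            struct toy_dom *table, size_t ndoms)
{
    size_t orphaned = 0;

    assert(maps != NULL && table != NULL);
    for (size_t i = 0; i < nmaps; i++) {
        if (!maps[i].in_use)
            continue;
        if (toy_find_domain(table, ndoms, maps[i].remote_domid) == NULL)
            orphaned++;      /* remote side is gone: nothing to update there */
        maps[i].in_use = 0;  /* the local entry is released either way */
    }
    return orphaned;
}
```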



:) 
Micha.
