Xen project Mailing List

Re: [Xen-devel] [PATCH v3] altp2m: Allow the hostp2m to be shared

To: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>

From: George Dunlap <george.dunlap@xxxxxxxxxx>

Date: Wed, 25 May 2016 17:08:59 +0100

Cc: Kevin Tian <kevin.tian@xxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Wed, 25 May 2016 16:13:45 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Wed, May 25, 2016 at 4:31 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx> wrote:
>
> On May 25, 2016 05:27, "George Dunlap" <george.dunlap@xxxxxxxxxx> wrote:
>>
>> On Fri, Apr 29, 2016 at 6:42 PM, Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
>> wrote:
>> > Don't propagate altp2m changes from ept_set_entry for memshare as
>> > memshare already has the lock. We call altp2m propagate changes once
>> > memshare successfully finishes. Allow the hostp2m entries to be of
>> > type p2m_ram_shared when applying mem_access. Also, do not trigger
>> > PoD for hostp2m when setting altp2m mem_access to be in-line with
>> > non-altp2m mem_access path.
>>
>> Hey Tamas,
>>
>> Sorry for the long delay in getting back to you on this.
>
> No problem, thanks for taking a closer look!
>
>>
>> So the main issue here (correct me if I'm wrong) is the locking
>> discipline: namely, mem_sharing_share_pages():
>> - Grabs the hostp2m lock
>> - Grabs the appropriate domain memsharing locks
>> - Calls set_shared_p2m_entry(), which ends up calling ept_set_entry(),
>>   which (when altp2m is active) grabs the altp2mlist and altp2m locks.
>>
>> This causes an ASSERT(), since the altp2mlist lock is ahead of the
>> memsharing locks in the list.
>>
>> But having taken a closer look at the code, I'm not sure the change is
>> quite correct. Please correct me if I've misread something:
>>
>> mem_sharing_share_pages() is passed two <domain,gfn> pairs -- the
>> <sd,sgfn> (which I assume stands for "shared gfn") and <cd,cgfn>
>> (which I assume stands for "copy"); and it
>
> Here s/c stands for source/client.
>
>> 1) Looks up smfn and cmfn, which back sgfn and cgfn respectively
>> 2) Looks up cmfn, which backs cgfn then replaces all gfn entries which
>> point to cmfn with smfn (updating accounting as appropriate)
>
> Hm, I might have missed that. Where does it do the lookup for all other
> cgfns backed by this cmfn?

I was looking at the loop in the middle of the function:

    while ( (gfn = rmap_iterate(cpage, &ri)) != NULL)
    {
        ...
    }

I haven't chased it down, but it looks like this walks the reverse map
of all gfns which map cpage; and for each such gfn it finds, it:
* Removes the cpage -> gfn rmap
* Adds an spage -> gfn map
* Reduces the type count of cpage
* Sets the p2m entry for that gfn to the smfn (rather than cmfn).

Obviously the common case is that the number of mappings is exactly 1;
but we need to either ensure that this is always true, or we need to
handle the case where it's not true. :-)

>> But this change will only call p2m_altp2m_propagate_change() for the
>> original cgfn -- any other gfns which are backed by cmfn will not have
>> the corresponding altp2m entries propagated properly.
>
> Right, if there is some other place where it does sharing in the back we
> would have to propagate that change.
>
>> This sort of mistake is easy to make, which is why I think we should
>> try to always update the altp2ms in ept_set_entry() if we can, to
>> minimize the opportunity for making this sort of mistake.
>>
>> Is there ever a reason to grab the altp2m lock and *then* grab the
>> sharing lock? Could we just move the sharing lock up between the p2m
>> lock and the altp2mlist lock instead?
>>
>
> I can't think of a scenario where we would get to sharing from altp2m with
> altp2m locking first. Not sure what you mean by moving the sharing lock up
> though. The problem is that sharing already has the lock by the time altp2m
> tries to lock, so we could pass that info down to make altp2m aware it needs
> no locking. It would require extending a bunch of functions though with an
> extra input that is barely ever used..

If you have altp2m there are three locks. There's one p2m lock for the
"host" p2m (that is, Xen's idea of what the mapping should look like).
Then there's the altp2mlist lock, which protects the *list* of altp2ms;
then each altp2m itself has its own lock.
These are defined in mm-locks.h and must be grabbed in that order: p2m
before altp2mlist, altp2mlist before altp2m.

I assume that the memsharing code is grabbing the hostp2m lock (it
should be anyway), then grabbing the memsharing locks. This is allowed
because the memsharing locks are defined after the p2m lock in
mm-locks.h. But then when updating the p2m entry, if you have an altp2m
active, it then tries to grab the altp2mlist lock so it can iterate
over the altp2ms. Since the altp2mlist lock is *before* the sharing
lock in mm-locks.h, this triggers an assert.

Is that not what your issue is?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
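
For readers following the lock-order argument in the message above, here is a minimal user-space sketch of the idea. It is not Xen's implementation: the tracker, the numeric levels, and every name in it (order_tracker, lock_levels, ordered_lock, sharing_path_ok) are hypothetical and exist only for illustration. The only thing taken from the thread is the relative order of the four locks. The sketch assumes that taking a lock whose level is lower than the most recently taken one is the violation, and it returns false at the point where the hypervisor would hit its ASSERT().

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_HELD 8

/* Hypothetical tracker: remembers the levels of the locks currently held. */
struct order_tracker {
    int held[MAX_HELD];
    int depth;
};

/* Hypothetical level table; a lower number must be taken first. */
struct lock_levels {
    int hostp2m;
    int sharing;
    int altp2mlist;
    int altp2m;
};

/* Order as described in the thread: altp2mlist and altp2m precede sharing. */
static const struct lock_levels order_as_described = {
    .hostp2m = 1, .sharing = 4, .altp2mlist = 2, .altp2m = 3,
};

/* Order suggested in the thread: sharing moved up between p2m and altp2mlist. */
static const struct lock_levels order_sharing_moved_up = {
    .hostp2m = 1, .sharing = 2, .altp2mlist = 3, .altp2m = 4,
};

/* Take a lock at the given level; false means the order was violated. */
static bool ordered_lock(struct order_tracker *t, int level)
{
    int current = t->depth ? t->held[t->depth - 1] : 0;

    if ( t->depth == MAX_HELD || level < current )
        return false;

    t->held[t->depth++] = level;
    return true;
}

/* Release the most recently taken lock. */
static void ordered_unlock(struct order_tracker *t)
{
    assert(t->depth > 0);
    t->depth--;
}

/*
 * Replays the sequence from the thread: hostp2m lock, then the sharing
 * lock, then (from the p2m entry update) the altp2mlist and altp2m locks.
 */
bool sharing_path_ok(const struct lock_levels *lv)
{
    struct order_tracker t = { .depth = 0 };
    bool ok = ordered_lock(&t, lv->hostp2m)
           && ordered_lock(&t, lv->sharing)
           && ordered_lock(&t, lv->altp2mlist)
           && ordered_lock(&t, lv->altp2m);

    while ( t.depth > 0 )
        ordered_unlock(&t);

    return ok;
}
```

Calling sharing_path_ok(&order_as_described) reports a violation at the altp2mlist step, while sharing_path_ok(&order_sharing_moved_up) completes, which is the effect the "move the sharing lock up" suggestion is after. Whether any other code path takes the altp2m locks before the sharing lock is the open question in the thread, and this sketch does not answer it.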
