
RE: [Xen-devel] Error restoring DomU when using GPLPV



The problem is that every page that is ballooned down by
the balloon driver can be slurped up as a private-persistent
("preswap") page by tmem.  Private-persistent pages contain
indirectly-accessible domain data, are counted against the
domain's tot_pages, and are migrated along with the domain's
directly-accessible pages.

So any temporary mapping of xenheap pages into domheap,
such as occurs during restore/migration, can cause max_pages
to be exceeded.
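
To make the failure mode concrete (made-up numbers, purely
illustrative):

  max_pages = 1000, tot_pages = 1000
  1. PV drivers balloon down 3 pages
     (1 shinfo + 2 grant frames)        -> tot_pages = 997
  2. tmem absorbs the freed pages as
     preswap                            -> tot_pages = 1000 again
  3. restore assigns the temporarily
     mapped shinfo + grant frames       -> tot_pages + 3 > max_pages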

This isn't a problem for tmem today, because tmem only runs
in PV domains, but I suspect the fragility of this approach
will come back and bite us.  It reminds me of the classic
"shell game".

Is there a per-domain counter of these special pages
somewhere?  If so, a MEMF flag could be used to subtract it
from max_pages in the limit check in assign_pages(),
e.g.:

max = d->max_pages;
if ( memflags & MEMF_no_special )
    max -= d->special_pages;
<snip>
    if ( unlikely((d->tot_pages + ...) > max) )
        /* Over-allocation */

(The special_pages counter would count any xenheap pages
that contain domain-specific data that needs to be
retained across a migration.)
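
A slightly fuller sketch of the same idea (still just a sketch:
MEMF_no_special and d->special_pages are hypothetical, and the
surrounding assign_pages() logic is paraphrased from memory):

    /* In assign_pages(), common/page_alloc.c -- hypothetical change. */
    unsigned int max = d->max_pages;

    /* Proposed flag: subtract the domain's count of "special"
     * xenheap-backed pages from the limit used for this check. */
    if ( memflags & MEMF_no_special )
        max -= d->special_pages;

    if ( unlikely((d->tot_pages + (1 << order)) > max) )
    {
        gdprintk(XENLOG_INFO, "Over-allocation for domain %u\n",
                 d->domain_id);
        goto fail;
    }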

Dan
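
P.S. For completeness, the driver-side ballooning discussed in the
quoted thread below boils down to roughly the following.  This is a
Linux-flavoured sketch only -- the actual GPLPV (Windows) code differs,
and make_room_for_xenheap_pages() and balloon_down() are hypothetical
helpers (the latter built on XENMEM_decrease_reservation):

    /* Sketch: balloon down so that tot_pages + nr will not exceed
     * max_pages before mapping nr xenheap pages (e.g. 1 shinfo +
     * 2 grant frames). */
    static void make_room_for_xenheap_pages(unsigned long nr)
    {
        domid_t domid = DOMID_SELF;
        long cur = HYPERVISOR_memory_op(XENMEM_current_reservation, &domid);
        long max = HYPERVISOR_memory_op(XENMEM_maximum_reservation, &domid);

        if ( cur >= 0 && max >= 0 && cur + (long)nr > max )
            balloon_down(cur + (long)nr - max);  /* hypothetical helper */
    }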

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> Sent: Thursday, September 17, 2009 12:21 AM
> To: Mukesh Rathor; Dan Magenheimer
> Cc: Annie Li; Joshua West; James Harper; xen-devel; Wayne Gong; Kurt
> Hackel
> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
> 
> 
> Yeah, all the PV drivers are having to do is balloon down one 
> page for every
> Xenheap page they map. There's no further complexity than 
> that, so let's not
> make a mountain out of a molehill. The approach as discussed and now
> implemented should work fine with tmem I think.
> 
>  -- Keir
> 
> On 16/09/2009 21:50, "Mukesh Rathor" <mukesh.rathor@xxxxxxxxxx> wrote:
> 
> > just in case someone missed the thread earlier,
> > 
> > 3 = 1 shinfo + 2 gnt frames default.
> > 
> > so, tot_pages + shinfo + num gnt frames.
> > 
> > 
> > Mukesh
> > 
> > 
> > 
> > Dan Magenheimer wrote:
> >> Before we close down this thread, I have a concern:
> >> 
> >> According to Mukesh, the fix to this bug is dependent
> >> on the pv drivers tracking tot_pages for a domain
> >> and ballooning to ensure tot_pages+3 does not exceed
> >> max_pages for the domain.
> >> 
> >> Well, tmem can affect tot_pages for a domain inside
> >> the hypervisor without any notification to pv drivers
> >> or the balloon driver.  And I'd imagine that PoD and
> >> future memory optimization mechanisms such as
> >> swapping and page-sharing may do the same.
> >> 
> >> So this solution seems very fragile.
> >> 
> >> Dan
> >> 
> >>> -----Original Message-----
> >>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
> >>> Sent: Wednesday, September 16, 2009 6:28 AM
> >>> To: Annie Li
> >>> Cc: Joshua West; Dan Magenheimer; xen-devel; Kurt Hackel;
> >>> James Harper;
> >>> Wayne Gong
> >>> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
> >>> 
> >>> 
> >>> On 16/09/2009 12:10, "ANNIE LI" <annie.li@xxxxxxxxxx> wrote:
> >>> 
> >>>>> I will do more tests to make sure, and will update here.
> >>>> I tried to map 256 grant frames during initialization and balloon
> >>>> down 256+1 (shinfo+gnttab) pages at first driver load. Then I did
> >>>> save/restore 50 times and live migration 10 times. No errors
> >>>> occurred.
> >>> Okay, well I still can't explain why that fixes it, but
> >>> clearly it does. So
> >>> that's good. :-)
> >>> 
> >>>  -- Keir
> >>> 
> >>> 
> >>> 
> >> 
> 
> 
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

