[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Balloon driver bug in increase_reservation



On Mon, 2013-09-02 at 16:13 +0100, Wei Liu wrote:
> On Mon, Sep 02, 2013 at 04:09:26PM +0100, Ian Campbell wrote:
> > On Mon, 2013-09-02 at 16:04 +0100, Wei Liu wrote:
> > > On Mon, Sep 02, 2013 at 03:48:43PM +0100, Ian Campbell wrote:
> > > > On Mon, 2013-09-02 at 15:43 +0100, Wei Liu wrote:
> > > > > Hi, Stefano
> > > > > 
> > > > > I found another bug in the balloon scratch page code. As I didn't 
> > > > > follow
> > > > > the discussion on scratch page so I cannot propose a proper fix at the
> > > > > moment.
> > > > > 
> > > > > The problem is that in balloon.c:increase_reservation, when a 
> > > > > ballooned
> > > > > page is resued, it can have a valid P2M entry pointing to the scratch,
> > > > > hitting the BUG_ON
> > > > > 
> > > > >    BUG_ON(!xen_feature(XENFEAT_auto_translated_physmap) &&
> > > > >           phys_to_machine_mapping_valid(pfn));
> > > > > 
> > > > > As balloon worker might run by a CPU other then the one that returns 
> > > > > the
> > > > > page, checking pfn_to_mfn(pfn) == local_cpu_scratch_page_mfn wouldn't
> > > > > work.  Checking pfn_to_mfn(pfn) belongs to the set of all scratch page
> > > > > mfns is not desirable.
> > > > 
> > > > This makes me think that whoever suggested that pfn_to_mfn for a
> > > > ballooned page out to return INVALID_MFN was right.
> > > > 
> > > 
> > > If there are many balloon pages the check can be expensive.
> > 
> > IIRC the suggestion was that the p2m for a ballooned out page would
> > contain INVALID_MFN, so the expense is just the lookup you would be
> > doing anyway.
> > 
> 
> That was David's idea. Stefano was worried that other PVOPS hooks would
> need to know about the MFN. I don't know how much it holds true. Need
> some input from Konrad I think.

Hrm, aren't those expected to not be operating on ballooned out pages
anyway? Prior to this change those callers would have got INVALID_MFN, I
think?

My gut feeling is that places which accidentally/unexpectedly have a
ballooned out page in their hand have either a virtual address or an mfn
(e.g. a cr2 fault address) and not a pfn in their hands, and won't in
general be that interested in the pfn and/or are already geared up to
deal with INVALID_MFN because previously that's what they would have
gotten. There's every chance I'm wrong though.

I wonder how migration copes with all this cleverness BTW...

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.