[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] Domain creation errors



>>> On 28.06.16 at 20:56, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 28/06/16 18:56, Andrew Cooper wrote:
>> <snip>
>>
>> This is the first in a number of changes trying to clean up error reporting 
> of
>> memory conditions.
> 
> One area which is constantly causing problems is creation of domains in
> low memory conditions.  In the case where the toolstack gets its
> calculations wrong, it is very difficult to diagnose what precicely went
> wrong, for a number of reasons.
> 
> The error handling in xc_domain_populate_physmap_exact() looks like
> 
>     if ( err >= 0 )
>     {
>         DPRINTF("Failed allocation for dom %d: %ld extents of order %d\n",
>                 domid, nr_extents, extent_order);
>         errno = EBUSY;
>         err = -1;
>     }
> 
> which hides any error whatsoever behind an -EBUSY.
> 
> The reason for this is that the hypercall interface for
> XENMEM_populate_physmap identifies which extent failed, but not why it
> failed.  (Actually, there are plenty of errors which are accounted
> against "the next extent", rather than being directly caused by that
> extent).
> 
> 
> The error conditions I want to be able to distinguish are "Xen is
> completely out of memory", and "You are trying to add memory to a domain
> beyond its allotted limit".  The former indicates that a toolstacks
> logic to account overall memory is problematic, while the latter
> indicates a mismatch between different toolstack areas.  Sadly, both end
> up as -EBUSY.
> 
> The first problem to solve is that alloc_domheap_pages() can't
> distinguish the errors.  Two returns from it are legitimately an
> indication of ENOMEM, but the assign_pages() failure is specifically
> not.  Returning a struct page_info pointer is unhelpful at
> distinguishing the issues.
> 
> I have thought of two approaches, but its hard to decide between them.
> 
> First is to change alloc_domheap_pages() prototype to
> 
> int alloc_domheap_pages(struct domain *d, unsigned int order, unsigned
> int memflags, struct page_info *out);
> 
> while the second is to use PTR_ERR().
> 
> Using PTR_ERR() is less disruptive to the code, but will cause
> collateral damage for anyone with out-of-tree patches, as the code will
> compile but the error logic will be wrong.  The use of PTR_ERR() is also
> quite dangerous in the context of a PV guest, as the resulting pointer
> is under 64bit guest ABI control.
> 
> I am leaning towards the first option, which at least has the advantage
> that any out-of-tree code will break in an obvious way.
> 
> Any opinions or alternate suggestions?

To be honest I'm not worried much about out of tree code, and
the err.h abstractions are precisely for cases like this. So I'm for
the PTR_ERR() variant.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.