
Re: [Xen-devel] Proposed XENMEM_claim_pages hypercall: Analysis of problem and alternate solutions



Hello,
On Dec 20, 2012, at 11:04 AM, Tim Deegan <tim@xxxxxxx> wrote:

> Hi,
> 
> At 17:17 -0500 on 18 Dec (1355851071), Konrad Rzeszutek Wilk wrote:
>> In essence, max_pages does work - _if_ one does these operations
>> in serial. We are trying to make this work in parallel and without
>> any failures - and for that, one approach that is quite simplistic
>> is the claim hypercall. It sets up a 'stake' of the amount of
>> memory that the hypervisor should reserve. This way other
>> guest creations/ballooning do not infringe on the 'claimed' amount.
>> 
>> I believe that with this hypercall Xapi can be made to do its
>> operations in parallel as well.
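
For illustration, here is a minimal sketch of how a toolstack might use such a
claim from libxc. The xc_domain_claim_pages() wrapper and its exact behaviour
(including a claim of zero cancelling an outstanding claim) are assumptions
based on the proposal under discussion, not a settled interface:

    /* Sketch only: stake out the whole allocation up front, so a
     * shortfall is reported before any long-running build work starts. */
    #include <xenctrl.h>

    static int build_with_claim(xc_interface *xch, uint32_t domid,
                                unsigned long nr_pages)
    {
        /* Reserve nr_pages for this domain; other builds and ballooning
         * cannot eat into the claimed amount. */
        if ( xc_domain_claim_pages(xch, domid, nr_pages) )
            return -1;                      /* fail fast, before any long work */

        /* ... populate the domain's memory as usual ... */

        /* Assumed semantics: a claim of zero releases whatever is left of
         * the claim once the real allocations are in place. */
        xc_domain_claim_pages(xch, domid, 0);
        return 0;
    }

The point is simply that the only long-running work happens after the claim has
succeeded, so a shortfall shows up immediately rather than part-way through.
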
> 
> The question of starting VMs in parallel seems like a red herring to me:
> - TTBOMK Xapi already can start VMs in parallel.  Since it knows what
>  constraints it's placed on existing VMs and what VMs it's currently
>  building, there is nothing stopping it.  Indeed, AFAICS any toolstack
>  that can guarantee enough RAM to build one VM at a time could do the
>  same for multiple parallel builds with a bit of bookkeeping (a sketch
>  of which follows this list).
> - Dan's stated problem (failure during VM build in the presence of
>  unconstrained guest-controlled allocations) happens even if there is
>  only one VM being created.
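
To make the "bit of bookkeeping" point concrete, here is a rough sketch of the
sort of admission check a toolstack could keep entirely on its own side. None
of this is Xapi code; the names are made up for illustration:

    /* Hypothetical toolstack-side admission check: track memory already
     * promised to in-flight builds and refuse a new build that would not
     * fit, so parallel creation needs no extra hypervisor support. */
    static unsigned long inflight_kb;   /* protected by the toolstack's lock */

    static int admit_build(unsigned long need_kb, unsigned long host_free_kb)
    {
        if ( need_kb + inflight_kb > host_free_kb )
            return -1;                  /* would not fit: reject up front */
        inflight_kb += need_kb;         /* reserve in the toolstack's books */
        return 0;
    }

    static void build_done(unsigned long need_kb)
    {
        inflight_kb -= need_kb;         /* the pages are now really allocated */
    }

Whether the per-build figure comes from static allocations or from the
toolstack's ballooning targets is a policy choice; the hypervisor is not
involved.
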
> 
>>>> Andres Lagar-Cavilla says "... this is because of shortcomings in the
>>>> [Xen] mm layer and its interaction with wait queues, documented
>>>> elsewhere."  In other words, this batching proposal requires
>>>> significant changes to the hypervisor, which I think we
>>>> all agreed we were trying to avoid.
>>> 
>>> Let me nip this at the bud. I use page sharing and other techniques in an 
>>> environment that doesn't use Citrix's DMC, nor is focused only on 
>>> proprietary kernels...
>> 
>> I believe what Dan is saying is that it is not enabled by default.
>> Meaning it does not get started by /etc/init.d/xencommons and
>> as such it never gets run (or does it now?) - unless one knows
>> about it - or it is enabled by default in a product. But perhaps
>> we are both mistaken? Is it enabled by default now on xen-unstable?
> 
> I think the point Dan was trying to make is that if you use page-sharing
> to do overcommit, you can end up with the same problem that self-balloon
> has: guest activity might consume all your RAM while you're trying to
> build a new VM.
> 
> That could be fixed by a 'further hypervisor change' (constraining the
> total amount of free memory that CoW unsharing can consume).  I suspect
> that it can also be resolved by using d->max_pages on each shared-memory
> VM to put a limit on how much memory they can (severally) consume.
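
As a sketch of that second option, the existing libxc call that sets
d->max_pages could be used to cap each sharing-enabled guest;
xc_domain_setmaxmem() takes the limit in KiB, and the headroom figure below is
purely an assumed toolstack policy, not anything the hypervisor mandates:

    /* Cap a sharing-enabled guest so that CoW unsharing cannot grow it
     * past what the toolstack has budgeted for it. */
    #include <xenctrl.h>

    static int cap_shared_guest(xc_interface *xch, uint32_t domid,
                                uint64_t target_kb, uint64_t headroom_kb)
    {
        /* d->max_pages is derived from this KiB figure; once the domain
         * reaches it, further unshares fail and are reported on the
         * sharing (mem event) ring rather than eating free host memory. */
        return xc_domain_setmaxmem(xch, domid, target_kb + headroom_kb);
    }
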

To be completely clear: I don't think we need a separate allocation/list of
pages/foo to absorb CoW hits. I think the solution is using d->max_pages.
Sharing will hit that limit and then send a notification via the "sharing"
(which is actually an ENOMEM) mem event ring.
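
For the ring-handling side, a rough outline of what the host agent's loop might
look like. The types and helpers here (struct sharing_event,
wait_for_sharing_event(), resolve_enomem(), resume_domain()) are hypothetical
stand-ins for whatever ring-handling code the agent already has, not a real Xen
API; only the shape of the loop is the point:

    /* Hypothetical event loop in the host agent: when unsharing hits
     * d->max_pages, the faulting vCPU blocks and an ENOMEM-style event
     * arrives on the sharing (mem event) ring; the agent must make room
     * before letting the guest continue. */
    #include <stdint.h>

    struct sharing_event {
        uint32_t domain;       /* domid that hit its max_pages limit  */
        uint64_t gfn;          /* frame whose unshare failed (ENOMEM) */
    };

    void wait_for_sharing_event(void *ring, struct sharing_event *ev);
    void resolve_enomem(uint32_t domain, uint64_t gfn);
    void resume_domain(uint32_t domain);

    static void sharing_event_loop(void *ring)
    {
        for ( ;; )
        {
            struct sharing_event ev;

            /* Blocks until the hypervisor posts an event for some
             * capped domain. */
            wait_for_sharing_event(ring, &ev);

            /* Policy decision: raise max_pages, balloon another guest
             * down, or reclaim memory elsewhere before resuming. */
            resolve_enomem(ev.domain, ev.gfn);

            /* Unpause the vCPU that faulted on the failed unshare. */
            resume_domain(ev.domain);
        }
    }
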

Andres
> 
>> Just as a summary, as this is getting to be a long thread - my
>> understanding has been that the hypervisor is supposed to be
>> toolstack independent.
> 
> Let's keep calm.  If people were arguing "xl (or xapi) doesn't need this
> so we shouldn't do it" that would certainly be wrong, but I don't think
> that's the case.  At least I certainly hope not!
> 
> The discussion ought to be around the actual problem, which is (as far
> as I can see) that in a system where guests are ballooning without
> limits, VM creation failure can happen after a long delay.  In
> particular it is the delay that is the problem, rather than the failure.
> Some solutions that have been proposed so far:
> - don't do that, it's silly (possibly true but not helpful);
> - this reservation hypercall, to pull the failure forward;
> - make allocation faster to avoid the delay (a good idea anyway,
>   but can it be made fast enough?);
> - use max_pages or similar to stop other VMs using all of RAM.
> 
> My own position remains that I can live with the reservation hypercall,
> as long as it's properly done - including handling PV 32-bit and PV
> superpage guests.
> 
> Cheers,
> 
> Tim.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

