Re: [Xen-devel] xl only waits 33 seconds for ballooning to complete

CCing Stefano, who was involved in the original xl ballooning stuff and
the other toolstack maintainers.

On Wed, 2015-01-07 at 18:11 -0700, Mike Latimer wrote:
> On Wednesday, January 07, 2015 09:38:31 AM Ian Campbell wrote:
> > That's exactly what I was about to suggest as I read the penultimate
> > paragraph, i.e. keep waiting so long as some reasonable delta occurs on
> > each iteration.
> Thanks, Ian.
> I wonder if there is a future-safe threshold on the amount of delta that
> indicates progress is being made. Should some minimum safe progress amount or
> percentage be set, or is it better to just make sure free memory is increasing
> at the end of each iteration of the loop?

I'm not sure. It seems like the balloon ought to be able to make *some*
progress over the course of 10s, even on a heavily loaded system.

The reason for my uncertainty is that there is a certain amount of noise
in the amount of current free memory in a system, since backend driver
operation (at least with some kernels) causes it to fluctuate a bit as
things are grant mapped and unmapped.

So using free_memkb as you have might take a bit of care.
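
As a rough sketch of what that care could look like, a progress check might
ignore growth below some noise floor. Everything here is an assumption for
illustration: the helper name made_progress() and the MIN_PROGRESS_KB value
are hypothetical and are not part of xl or libxl, and the right threshold
(if one exists at all) would have to be found by measurement.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical noise floor: growth smaller than this is not counted as
 * ballooning progress. The value is an illustrative guess only. */
#define MIN_PROGRESS_KB 1024u

/* Return true only if free memory grew by at least MIN_PROGRESS_KB since
 * the previous sample. The cur > prev test comes first so the unsigned
 * subtraction can never wrap around. */
static bool made_progress(uint32_t prev_kb, uint32_t cur_kb)
{
    return cur_kb > prev_kb && (cur_kb - prev_kb) >= MIN_PROGRESS_KB;
}
```

With this shape, a small upward wobble in free_memkb would not reset the
retry budget, while a real balloon-down of several MB per iteration would.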

I'm more inclined to suggest you'd be better off checking that dom0's
actual allocation is shrinking, but that might be subject to the same
noise (I'm not sure if/how grants are accounted to a guest...).

Which sort of leads me towards thinking that the loop should tolerate
slight *increases* in dom0's allocation, but that simply can't be right!

> For example, the following simple change just tracks free_memkb and only 
> decrements the retry count if it has not increased since the last check:
> ----------------------
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index ed0d478..4cf2991 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -2196,7 +2196,7 @@ static int preserve_domain(uint32_t *r_domid, libxl_event *event,
>  static int freemem(uint32_t domid, libxl_domain_build_info *b_info)
>  {
>      int rc, retries = 3;
> -    uint32_t need_memkb, free_memkb;
> +    uint32_t need_memkb, free_memkb, free_memkb_prev = 0;
>      if (!autoballoon)
>          return 0;
> @@ -2229,7 +2229,10 @@ static int freemem(uint32_t domid, libxl_domain_build_info *b_info)
>          if (rc < 0)
>              return rc;
> -        retries--;
> +        /* only decrement retry count if free_memkb is not increasing */
> +        if (free_memkb <= free_memkb_prev)
> +            retries--;
> +        free_memkb_prev = free_memkb;
>      } while (retries > 0);
>      return ERROR_NOMEM;
> ----------------------
> I'm not sure if the above approach is always safe, but it works in my
> testing. I'd appreciate any other thoughts you might have before I try
> submitting an official patch for this...
> Thanks,
> Mike
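
To make the retry accounting in the quoted patch easier to reason about, here
is a small stand-alone model of it. This is only a sketch under assumptions:
simulate_freemem() is a hypothetical function, the samples[] array stands in
for successive free_memkb readings, and none of the real libxl calls or
waits are modelled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the patched loop: returns the index of the first
 * sample that satisfies need_memkb, or -1 if the retries run out (or the
 * samples are exhausted) before that happens. */
static int simulate_freemem(const uint32_t *samples, size_t n,
                            uint32_t need_memkb, int retries)
{
    uint32_t free_memkb_prev = 0;
    size_t i;

    for (i = 0; i < n && retries > 0; i++) {
        uint32_t free_memkb = samples[i];

        if (free_memkb >= need_memkb)
            return (int)i;

        /* As in the patch: only consume a retry when free memory has
         * not increased since the previous sample. */
        if (free_memkb <= free_memkb_prev)
            retries--;
        free_memkb_prev = free_memkb;
    }
    return -1;
}
```

In this model a steadily growing series such as 100, 200, 300, 400, 500 with
need_memkb = 450 succeeds at the fifth sample without using any retries,
while a stalled series such as 100, 100, 100, 100, 500 gives up after three
non-increasing samples. It also shows that any increase at all, however
small, keeps the loop alive, which is where the noise concern comes in.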
