
Re: [Xen-devel] [RFC PATCH] Start PV guest faster



On Tue, 2014-05-20 at 10:30 +0100, Jan Beulich wrote:
> >>> On 20.05.14 at 09:26, <frediano.ziglio@xxxxxxxxxx> wrote:
> > Experimental patch that tries to allocate large chunks in order to start
> > PV guests quickly.
> 
> The fundamental idea is certainly welcome.
> 
> > It's been a while since I noticed that the time to start a large PV guest
> > depends on the amount of memory. For VMs with 64 or more GB of RAM the
> > time can become quite significant (around 20 seconds). Digging around I
> > found that a lot of time is spent populating RAM (from a single hypercall
> > made by xenguest).
> 
> Did you check whether - like noticed elsewhere - this is due to
> excessive hypercall preemption/restart? I.e. whether making
> the preemption checks less fine grained helps?
> 

Yes, you are right!

Sorry for the late reply, I only got some time now. I did some tests on a
not so big machine (3GB) and used strace to see the amount of time spent:

| Xen preempt check | User allocation  | Time of all ioctls (sec) |
| yes               | single pages     | 0.262                    |
| no                | single pages     | 0.0612                   |
| yes               | bunch of pages   | 0.0325                   |
| no                | bunch of pages   | 0.0280                   |

So yes, the preemption check (which I disabled entirely for these tests!) is
the main factor. Of course disabling it entirely is not the right solution.
Is there some way to work out how often the check should be done, some sort
of computation/timing?
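
Just to make the question concrete, what I was imagining is something along
these lines for the populate loop. This is a pure sketch of the shape of the
loop, not the actual code in xen/common/memory.c, and the batch size of 256
is an arbitrary guess rather than a measured value:

    /* Sketch only: check for preemption every PREEMPT_BATCH extents
     * instead of on every single iteration.  PREEMPT_BATCH = 256 is an
     * arbitrary guess for illustration, not a tuned value. */
    #define PREEMPT_BATCH 256

    for ( i = a->nr_done; i < a->nr_extents; i++ )
    {
        if ( i != a->nr_done && (i % PREEMPT_BATCH) == 0 &&
             hypercall_preempt_check() )
        {
            a->preempted = 1;   /* arrange for a continuation */
            goto out;
        }

        /* ... allocate one 2^extent_order chunk and assign it to the guest ... */
    }

The open question is what a sensible batch value is, and whether it should
depend on the extent order (bigger extents mean more work per iteration).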


> > The improvement is quite significant (the hypercall is more than 20
> > times faster for a machine with 3GB), however there are different things
> > to consider:
> > - should this optimization be done inside Xen? If the change is just in
> > userspace this surely makes Xen simpler and safer, but on the other hand
> > Xen is better placed to know whether big chunks are preferable or not
> 
> Except that Xen has no way to tell what "better" here would be.
> 
> > - a debug Xen returns pages in reverse order while the chunks have to be
> > allocated sequentially. Is this a problem?
> 
> I think the ability to populate guest memory with (largely, but not
> necessarily entirely) discontiguous memory should be retained for
> debugging purposes (see also below).
> 
> > I didn't find any piece of code where superpages is turned on in
> > xc_dom_image, but I think that if the number of pages is not a multiple
> > of the superpage size the code allocates a bit less memory for the guest.
> 
> I think that's expected - I wonder whether that code is really in use
> by anyone...
> 
> > @@ -820,9 +831,11 @@ int arch_setup_meminit(struct xc_dom_image *dom)
> >              allocsz = dom->total_pages - i;
> >              if ( allocsz > 1024*1024 )
> >                  allocsz = 1024*1024;
> > -            rc = xc_domain_populate_physmap_exact(
> > -                dom->xch, dom->guest_domid, allocsz,
> > -                0, 0, &dom->p2m_host[i]);
> > +            /* try big chunk of memory first */
> > +            if ( (allocsz & ((1<<10)-1)) == 0 )
> > +                rc = populate_range(dom, &dom->p2m_host[i], i, 10, allocsz);
> > +            if ( rc )
> > +                rc = populate_range(dom, &dom->p2m_host[i], i, 0, allocsz);
> 
> So on what basis was 10 chosen here? I wonder whether this
> shouldn't be
> (a) smaller by default,
> (b) configurable (globally or even per guest),
> (c) dependent on the total memory getting assigned to the guest,
> (d) tried with sequentially decreasing order after failure.
> 
> Additionally you're certainly aware that allocation failures lead to
> hypervisor log messages (as today already seen when HVM guests
> can't have their order-18 or order-9 allocations fulfilled). We may
> need to think about ways to suppress these messages for such
> allocations where the caller intends to retry with a smaller order.
> 
> Jan
> 

Well, the patch was mainly a test, and 10 was chosen just for testing. I
have no firm idea about the best algorithm to follow; it should surely
handle unaligned situations better, and it should surely decrease the size
in steps after a failure.
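
For the caller side I am thinking of something roughly like the following
(just a sketch; the starting order of 9 and stepping down one order at a
time are arbitrary, untuned choices, and populate_range() stands for a
helper like the one in the patch, sketched further down):

    /* Sketch only: try progressively smaller extent orders until one
     * succeeds.  Order 9 (2MB on x86) as a starting point and stepping
     * down one order at a time are arbitrary choices. */
    unsigned int order = 9;
    int rc = -1;

    while ( rc != 0 )
    {
        /* Only try this order if the chunk is a whole number of extents. */
        if ( (allocsz & ((1UL << order) - 1)) == 0 )
            rc = populate_range(dom, &dom->p2m_host[i], i, order, allocsz);

        if ( rc != 0 )
        {
            if ( order == 0 )
                break;      /* even single pages failed, give up */
            order--;        /* retry with a smaller extent size */
        }
    }

This would also cover Jan's point (d), though it makes the log-noise issue
for failed large allocations more visible.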

Tomorrow I hope to get some time to try multiple chunk sizes and understand
this better.
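
For reference, the idea behind a populate_range()-style helper is just to
wrap xc_domain_populate_physmap_exact() with a given extent order and expand
the returned base frames into the p2m array. A simplified sketch follows;
it is illustrative only, not the exact code from the patch, and it assumes
the usual libxc context (xc_dom.h, stdlib.h) plus count being a multiple of
1 << order:

    /* Illustrative sketch of a populate_range()-style helper.  It populates
     * "count" guest pages using 2^order page extents and expands the
     * returned base frames into the builder's p2m array. */
    static int populate_range(struct xc_dom_image *dom, xen_pfn_t *p2m,
                              xen_pfn_t base_pfn, unsigned int order,
                              xen_pfn_t count)
    {
        xen_pfn_t nr_extents = count >> order;
        xen_pfn_t *extents;
        xen_pfn_t j, k;
        int rc;

        (void)base_pfn;   /* not needed in this simplified version */

        if ( order == 0 )
            return xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
                                                    count, 0, 0, p2m);

        extents = malloc(nr_extents * sizeof(*extents));
        if ( extents == NULL )
            return -1;

        /* One gfn per extent: the base of each 2^order block. */
        for ( k = 0; k < nr_extents; k++ )
            extents[k] = p2m[k << order];

        rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
                                              nr_extents, order, 0, extents);
        if ( rc == 0 )
        {
            /* For a PV guest the hypercall writes back the machine frame of
             * each extent; expand them into per-page entries. */
            for ( k = 0; k < nr_extents; k++ )
                for ( j = 0; j < ((xen_pfn_t)1 << order); j++ )
                    p2m[(k << order) + j] = extents[k] + j;
        }

        free(extents);
        return rc;
    }

This mirrors what the existing superpage path in arch_setup_meminit already
does for order-9 allocations, just parameterised on the order.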

Frediano



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

