
Re: [Xen-devel] [RFC PATCH] Start PV guest faster



On Tue, 2014-05-20 at 10:30 +0100, Jan Beulich wrote:
> >>> On 20.05.14 at 09:26, <frediano.ziglio@xxxxxxxxxx> wrote:
> > Experimental patch that tries to allocate large chunks in order to start
> > PV guests quickly.
> 
> The fundamental idea is certainly welcome.
> 
> > It's been a while since I noticed that the time to start a large PV guest
> > depends on the amount of memory. For VMs with 64 or more GB of RAM the
> > time can become quite significant (around 20 seconds). Digging around I
> > found that a lot of time is spent populating RAM (from a single hypercall
> > made by xenguest).
> 
> Did you check whether - like noticed elsewhere - this is due to
> excessive hypercall preemption/restart? I.e. whether making
> the preemption checks less fine grained helps?
> 

Yes, you are right!

Sorry for the late reply, I only got some time now. I did some tests on a
not so big machine (3GB) and used strace to see the amount of time spent:

| Xen preempt check | User allocation  | Time of all ioctls (sec) |
| yes               | single pages     | 0.262                    |
| no                | single pages     | 0.0612                   |
| yes               | bunch of pages   | 0.0325                   |
| no                | bunch of pages   | 0.0280                   |

So yes, the preemption check (which I disabled entirely for these tests!) is
the main factor. Of course disabling it entirely is not the right solution.
Is there some way to work out how often the check should be done, some sort
of computation/timing?
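
Just to make the question concrete, what I was imagining is something along
these lines for the populate loop. This is a pure sketch of the shape of the
loop, not the actual code in xen/common/memory.c, and the batch size of 256
is an arbitrary guess rather than a measured value:

    /* Sketch only: check for preemption every PREEMPT_BATCH extents
     * instead of on every single iteration.  PREEMPT_BATCH = 256 is an
     * arbitrary guess for illustration, not a tuned value. */
    #define PREEMPT_BATCH 256

    for ( i = a->nr_done; i < a->nr_extents; i++ )
    {
        if ( i != a->nr_done && (i % PREEMPT_BATCH) == 0 &&
             hypercall_preempt_check() )
        {
            a->preempted = 1;   /* arrange for a continuation */
            goto out;
        }

        /* ... allocate one 2^extent_order chunk and assign it to the guest ... */
    }

The open question is what a sensible batch value is, and whether it should
depend on the extent order (bigger extents mean more work per iteration).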


> > The improvement is quite significant (the hypercall is more than 20
> > times faster for a machine with 3GB), however there are different things
> > to consider:
> > - should this optimization be done inside Xen? If the change is just in
> > userspace this surely makes Xen simpler and safer, but on the other hand
> > Xen is better placed to know whether big chunks are preferable or not
> 
> Except that Xen has no way to tell what "better" here would be.
> 
> > - a debug Xen returns pages in reverse order while the chunks have to be
> > allocated sequentially. Is this a problem?
> 
> I think the ability to populate guest memory with (largely, but not
> necessarily entirely) discontiguous memory should be retained for
> debugging purposes (see also below).
> 
> > I didn't find any piece of code where superpages is turned on in
> > xc_dom_image, but I think that if the number of pages is not a multiple
> > of the superpage size the code allocates a bit less memory for the guest.
> 
> I think that's expected - I wonder whether that code is really in use
> by anyone...
> 
> > @@ -820,9 +831,11 @@ int arch_setup_meminit(struct xc_dom_image *dom)
> >              allocsz = dom->total_pages - i;
> >              if ( allocsz > 1024*1024 )
> >                  allocsz = 1024*1024;
> > -            rc = xc_domain_populate_physmap_exact(
> > -                dom->xch, dom->guest_domid, allocsz,
> > -                0, 0, &dom->p2m_host[i]);
> > +            /* try big chunk of memory first */
> > +            if ( (allocsz & ((1<<10)-1)) == 0 )
> > +                rc = populate_range(dom, &dom->p2m_host[i], i, 10, allocsz);
> > +            if ( rc )
> > +                rc = populate_range(dom, &dom->p2m_host[i], i, 0, allocsz);
> 
> So on what basis was 10 chosen here? I wonder whether this
> shouldn't be
> (a) smaller by default,
> (b) configurable (globally or even per guest),
> (c) dependent on the total memory getting assigned to the guest,
> (d) tried with sequentially decreasing order after failure.
> 
> Additionally you're certainly aware that allocation failures lead to
> hypervisor log messages (as today already seen when HVM guests
> can't have their order-18 or order-9 allocations fulfilled). We may
> need to think about ways to suppress these messages for such
> allocations where the caller intends to retry with a smaller order.
> 
> Jan
> 

Well, the patch was mainly a test, and 10 was chosen just for testing. I
have no firm idea about the best algorithm to follow; it should surely
handle unaligned situations better, and it should surely decrease the size
in steps after a failure.
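
For the caller side I am thinking of something roughly like the following
(just a sketch; the starting order of 9 and stepping down one order at a
time are arbitrary, untuned choices, and populate_range() stands for a
helper like the one in the patch, sketched further down):

    /* Sketch only: try progressively smaller extent orders until one
     * succeeds.  Order 9 (2MB on x86) as a starting point and stepping
     * down one order at a time are arbitrary choices. */
    unsigned int order = 9;
    int rc = -1;

    while ( rc != 0 )
    {
        /* Only try this order if the chunk is a whole number of extents. */
        if ( (allocsz & ((1UL << order) - 1)) == 0 )
            rc = populate_range(dom, &dom->p2m_host[i], i, order, allocsz);

        if ( rc != 0 )
        {
            if ( order == 0 )
                break;      /* even single pages failed, give up */
            order--;        /* retry with a smaller extent size */
        }
    }

This would also cover Jan's point (d), though it makes the log-noise issue
for failed large allocations more visible.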

Tomorrow I hope to get some time to try multiple chunk sizes and understand
this better.
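
For reference, the idea behind a populate_range()-style helper is just to
wrap xc_domain_populate_physmap_exact() with a given extent order and expand
the returned base frames into the p2m array. A simplified sketch follows;
it is illustrative only, not the exact code from the patch, and it assumes
the usual libxc context (xc_dom.h, stdlib.h) plus count being a multiple of
1 << order:

    /* Illustrative sketch of a populate_range()-style helper.  It populates
     * "count" guest pages using 2^order page extents and expands the
     * returned base frames into the builder's p2m array. */
    static int populate_range(struct xc_dom_image *dom, xen_pfn_t *p2m,
                              xen_pfn_t base_pfn, unsigned int order,
                              xen_pfn_t count)
    {
        xen_pfn_t nr_extents = count >> order;
        xen_pfn_t *extents;
        xen_pfn_t j, k;
        int rc;

        (void)base_pfn;   /* not needed in this simplified version */

        if ( order == 0 )
            return xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
                                                    count, 0, 0, p2m);

        extents = malloc(nr_extents * sizeof(*extents));
        if ( extents == NULL )
            return -1;

        /* One gfn per extent: the base of each 2^order block. */
        for ( k = 0; k < nr_extents; k++ )
            extents[k] = p2m[k << order];

        rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
                                              nr_extents, order, 0, extents);
        if ( rc == 0 )
        {
            /* For a PV guest the hypercall writes back the machine frame of
             * each extent; expand them into per-page entries. */
            for ( k = 0; k < nr_extents; k++ )
                for ( j = 0; j < ((xen_pfn_t)1 << order); j++ )
                    p2m[(k << order) + j] = extents[k] + j;
        }

        free(extents);
        return rc;
    }

This mirrors what the existing superpage path in arch_setup_meminit already
does for order-9 allocations, just parameterised on the order.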

Frediano



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

