
RE: [Xen-devel] unnecessary VCPU migration happens again



 

> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of 
> Emmanuel Ackaouy
> Sent: 06 December 2006 14:02
> To: Xu, Anthony
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; xen-ia64-devel
> Subject: Re: [Xen-devel] unnecessary VCPU migration happens again
> 
> Hi Anthony.
> 
> Could you send xentrace output for scheduling operations
> in your setup?
> 
> Perhaps we're being a little too aggressive spreading
> work across sockets. We do this on vcpu_wake right now.
> 
> I'm not sure I understand why HVM VCPUs would block
> and wake more often than PV VCPUs though. Can you
> explain?

Whilst I don't know the specifics of the original poster's setup, I can tell
you why HVM and PV guests see a differing number of scheduling
operations...

Every time you get an IOIO/MMIO vmexit that leads to a qemu-dm
interaction, you'll get a context switch. So for an average IDE block
read/write (for example) on x86, you get 4-5 IOIO intercepts to send the
command to qemu, then an interrupt is sent to the guest to indicate that
the operation is finished, followed by a 256 x 16-bit IO read/write of
the sector content (which is normally just one IOIO intercept unless the
driver is "stupid"). This means around a dozen or so schedule operations
to do one disk IO operation.
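
As a rough illustration, here is a back-of-the-envelope count of those
scheduling operations in C; the per-step numbers are assumptions taken
from the description above, not measurements:

#include <stdio.h>

/* Back-of-the-envelope count of scheduling operations for one emulated
 * IDE sector read, following the steps described above.  All counts are
 * illustrative assumptions, not measurements. */
int main(void)
{
    int cmd_intercepts  = 5;  /* IOIO exits to program the IDE command block */
    int completion_irq  = 1;  /* interrupt injected to signal completion     */
    int data_intercepts = 1;  /* one intercept for the 256 x 16-bit transfer */
    int qemu_roundtrips = cmd_intercepts + completion_irq + data_intercepts;

    /* Each qemu-dm interaction blocks the HVM VCPU and later wakes it,
     * i.e. roughly two scheduling decisions per round trip. */
    printf("~%d schedule operations per disk IO\n", qemu_roundtrips * 2);
    return 0;
}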

The same operation in PV (or using a PV driver in an HVM guest, of course)
would require a single transaction from DomU to Dom0 and back, so only
two schedule operations.
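
For contrast, here is a minimal C sketch of that PV round trip; the
structures and function names are hypothetical stand-ins, not the real
blkfront/blkback code:

/* Hypothetical sketch of a PV split-driver request: the whole command
 * goes onto the shared ring with plain memory writes, and only the two
 * event-channel notifications (DomU -> Dom0 and back) involve the
 * scheduler. */

struct blk_request  { unsigned long sector; int nr_segments; };
struct blk_response { int status; };

/* Stand-ins for the event-channel send and the blocking wait that a
 * real frontend would use. */
static void notify_backend(void) { /* event-channel notification */ }
static struct blk_response wait_for_response(void)
{
    struct blk_response rsp = { 0 };  /* pretend Dom0 has answered */
    return rsp;
}

static struct blk_response pv_do_block_io(struct blk_request *req)
{
    (void)req;                  /* request already written to the ring  */
    notify_backend();           /* DomU blocks, Dom0 wakes: schedule #1 */
    return wait_for_response(); /* Dom0 replies, DomU wakes: schedule #2 */
}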

The same "problem" occurs, of course, for other hardware devices such as
network, keyboard and mouse, where a transaction consists of more than a
single read or write to a single register.

--
Mats
> 
> If you could gather some scheduler traces and send the
> results, that would give us a good idea of what's going
> on and why. The multi-core support is new and not
> widely tested, so it's possible that it is being
> overly aggressive or perhaps even buggy.
> 
> Emmanuel.
> 
> 
> On Fri, Dec 01, 2006 at 06:11:32PM +0800, Xu, Anthony wrote:
> > Emmanuel,
> > 
> > I found that unnecessary VCPU migration happens again.
> > 
> > 
> > My environment is,
> > 
> > IPF, two sockets, two cores per socket, 1 thread per core.
> > 
> > There are 4 cores in total.
> > 
> > There are 3 domains; they are all UP,
> > so there are 3 VCPUs in total.
> > 
> > One is domain0,
> > the other two are VTI domains.
> > 
> > I found there are lots of migrations.
> > 
> > 
> > This is caused by the code segment below in function csched_cpu_pick.
> > When I comment out this code segment, there is no migration in the
> > above environment.
> > 
> > 
> > 
> > I have done a little analysis of this code.
> > 
> > This code handles multi-core and multi-threading, which is very good.
> > If two VCPUs run on LPs which belong to the same core, then the
> > performance is bad, so if there are free LPs, we should let these two
> > VCPUs run on different cores.
> > 
> > This code may work well with para-domains,
> > because a para-domain is seldom blocked;
> > it may only block due to the guest calling the "halt" instruction.
> > This means that if an idle VCPU is running on an LP,
> > there is no non-idle VCPU running on this LP.
> > In this environment, I think the code below should work well.
> > 
> > 
> > But in an HVM environment, HVM VCPUs are blocked by IO operations.
> > That is to say, if an idle VCPU is running on an LP, an HVM VCPU
> > may be blocked on it, and that HVM VCPU will run on this LP when
> > it is woken up.
> > In this environment, the code below causes unnecessary migrations.
> > I think this defeats the goal of this code segment.
> > 
> > On the IPF side, migration is time-consuming, so this causes some
> > performance degradation.
> > 
> > 
> > I have a proposal, though it may not be a good one.
> > 
> > We can change the meaning of idle-LP:
> > 
> > an idle-LP means that an idle VCPU is running on this LP, and there
> > is no VCPU blocked on this LP (if such a VCPU were woken up, it
> > would run on this LP).
> > 
> > 
> > 
> > --Anthony
> > 
> > 
> >         /*
> >          * In multi-core and multi-threaded CPUs, not all idle execution
> >          * vehicles are equal!
> >          *
> >          * We give preference to the idle execution vehicle with the most
> >          * idling neighbours in its grouping. This distributes work across
> >          * distinct cores first and guarantees we don't do something stupid
> >          * like run two VCPUs on co-hyperthreads while there are idle cores
> >          * or sockets.
> >          */
> >         while ( !cpus_empty(cpus) )
> >         {
> >             nxt = first_cpu(cpus);
> > 
> >             if ( csched_idler_compare(cpu, nxt) < 0 )
> >             {
> >                 cpu = nxt;
> >                 cpu_clear(nxt, cpus);
> >             }
> >             else if ( cpu_isset(cpu, cpu_core_map[nxt]) )
> >             {
> >                 cpus_andnot(cpus, cpus, cpu_sibling_map[nxt]);
> >             }
> >             else
> >             {
> >                 cpus_andnot(cpus, cpus, cpu_core_map[nxt]);
> >             }
> > 
> >             ASSERT( !cpu_isset(nxt, cpus) );
> >         }
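
Regarding the idle-LP proposal above, a minimal C sketch of the idea,
using entirely hypothetical names (this is not existing Xen scheduler
code), might look like this:

/* Hypothetical sketch of Anthony's proposal: treat an LP as idle only
 * if the idle VCPU is running on it AND no blocked VCPU is parked on it
 * (i.e. no VCPU that would resume on this LP when woken). */

struct lp_state {
    int running_idle_vcpu;   /* the idle VCPU is on this LP right now      */
    int nr_blocked_vcpus;    /* VCPUs that will run on this LP when woken,
                              * e.g. HVM VCPUs waiting on qemu-dm I/O      */
};

static int lp_is_truly_idle(const struct lp_state *lp)
{
    /* An LP with a blocked VCPU attached is not a good migration
     * target, even though it looks idle at this instant. */
    return lp->running_idle_vcpu && (lp->nr_blocked_vcpus == 0);
}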

