Re: [Xen-devel] VCPU migration overhead compensation?
Hi, Dario and George: Thanks for your reply. I am doing the measurement within Xen. I agree that a more precise description would be "the time spent performing scheduling and context switch", and it does NOT consider cache effects. (I think the cache effect should be measured at the application level, not in the hypervisor.)
Basically I add code in trace.h, schedule.c, and domain.c (for context_switch). I measure three times here:
1. The time spent making scheduling decisions, defined as scheduling latency. It includes grabbing the spinlock, calling the scheduler-dependent do_scheduler() function, and releasing the spinlock;
2. If the scheduler decides to perform a context switch (prev != next), it calls context_switch() in domain.c. I measure the time spent there, up to right before context_saved(prev);
3. The time spent in context_saved(prev), which inserts the current VCPU back into the runqueue. If a global queue is used and protected by a spinlock, this includes the time to grab the lock, and so on.
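To make the measurement pattern concrete, here is a minimal userspace C sketch of the timestamp-delta idea. The names (now_ns, scheduling_decision) and the clock_gettime() clock are only illustrative; the real instrumentation in schedule.c/domain.c uses Xen's internal time source and the trace buffer instead.

#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Illustrative only: inside the hypervisor the timestamps come from
 * Xen's own clock and the deltas are emitted via the trace buffer. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

/* Stand-in for the work being measured, e.g. the scheduling decision
 * made between taking and releasing the runqueue spinlock. */
static void scheduling_decision(void) { /* ... */ }

int main(void)
{
    uint64_t t0 = now_ns();   /* timestamp right before the section */
    scheduling_decision();    /* measured section                   */
    uint64_t t1 = now_ns();   /* timestamp right after the section  */

    printf("scheduling latency: %llu ns\n",
           (unsigned long long)(t1 - t0));
    return 0;
}

The same before/after pattern is applied around context_switch() and context_saved(prev).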
I ran the experiments on an i7 six-core machine, with hyper-threading and SpeedStep disabled. It has one socket; each core has dedicated L1 and L2 caches, and all cores share the L3 cache. The host is CentOS 6.3 with Linux 3.4.35, and Xen is 4.1.4.
Domain 0 boots with one VCPU pinned to core 0. One guest VM boots with 5 VCPUs, with its cpumask set to the remaining 5 cores. I am using a gedf scheduler which I am working on; it is essentially a sedf scheduler sharing one global queue.
In the guest VM, I just ran 5 CPU busy loops to keep all the VCPUs busy. The data is collected via xentrace with the -M option, and I measured for 10 seconds. The numbers are scheduler dependent, but the absolute time spent in context_switch() should be scheduler independent. In my measurement, the worst case happens when we context switch from an IDLE VCPU to a busy VCPU (which was previously scheduled on a different core) and perform the tlb_flush. It is around 2 microseconds on my machine.
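For reference, the CDF plots are just the empirical distribution of the per-event deltas. A rough C sketch of how such a curve can be produced from the recorded samples (the sample values below are made up for illustration; in practice they are the deltas parsed out of the xentrace log) is:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* qsort comparator for 64-bit latency samples (in ns) */
static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    /* Made-up samples, standing in for one of the three measurements. */
    uint64_t samples[] = { 750, 820, 910, 1200, 2100 };
    size_t n = sizeof(samples) / sizeof(samples[0]);
    size_t i;

    qsort(samples, n, sizeof(samples[0]), cmp_u64);

    /* Empirical CDF: fraction of samples <= each observed value */
    for (i = 0; i < n; i++)
        printf("%llu %.3f\n",
               (unsigned long long)samples[i], (double)(i + 1) / n);

    return 0;
}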
I am attaching the CDF plots of these three measurements. Any feedback is welcome. Thanks.

Sisu

On Wed, Apr 10, 2013 at 3:25 AM, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
On mer, 2013-04-10 at 09:21 +0100, George Dunlap wrote:

Sisu Xi, PhD Candidate
http://www.cse.wustl.edu/~xis/
Department of Computer Science and Engineering
Campus Box 1045
Washington University in St. Louis
One Brookings Drive
St. Louis, MO 63130

Attachment: gedf_one_nopin_context_saved.eps
Attachment: gedf_one_nopin_context_switch.eps
Attachment: gedf_one_nopin_sched_latency.eps