xen-devel

Re: [Xen-devel] RE: The calculation of the credit in credit_scheduler

To: "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] RE: The calculation of the credit in credit_scheduler
From: Keir Fraser <keir@xxxxxxx>
Date: Fri, 05 Nov 2010 08:07:04 +0000
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>
On 05/11/2010 07:26, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx> wrote:

> Maybe idlers shouldn't earn credit at the calculation points. I did an
> experiment before; not giving idlers credit reduces the unfairness.
> 
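> A minimal sketch of the idea, against the accounting loop in
> xen/common/sched_credit.c (the "burned_this_period" field is hypothetical
> bookkeeping, not existing code):
> 
>     /* In the csched_acct()-style loop that hands credit_fair to each
>      * active vcpu: skip vcpus that consumed nothing last period. */
>     list_for_each_entry( svc, &sdom->active_vcpu, active_vcpu_elem )
>     {
>         if ( svc->burned_this_period == 0 )
>             continue;                      /* idler: earns nothing */
>         atomic_add(credit_fair, &svc->credit);
>     }
> 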
> Besides this issue, I also have some other findings I want to share with
> you guys, to get more input about the credit scheduler.
> 
> 1. Interrupt delivery for assigned devices is done in a tasklet, and the
> tasklet runs in the idle vcpu's context, but the scheduler's behavior when
> scheduling the idle vcpu looks very strange. Ideally, after switching to
> the idle vcpu to execute the tasklet, the previous vcpu should be switched
> back in once the tasklet is done; but the current policy is to choose
> another vcpu from the runq. That is to say, when an interrupt arrives on a
> CPU, that CPU may do a real task switch. This may not be acceptable when
> the interrupt frequency is high, and according to our experiments it also
> introduces some performance bugs. Even if we could switch back to the
> previous vcpu after executing the tasklet, how to determine the timeslice
> for its next run is a key issue, and this is not addressed. If we still
> give it 30ms for its restarted run, it may trigger fairness issues, I
> think.

Interrupt delivery is a victim of our switching the tasklet implementation
to work in idle-VCPU context instead of in softirq context. It might be
sensible to use softirqs directly from the interrupt-delivery logic, or to
introduce a second type of tasklet (built on softirqs), or perhaps we can
think of a way to structure interrupt delivery that doesn't need softirq
context at all -- that would be nice! What did we need softirq context for
in the first place?
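
A minimal sketch of the softirq route, using the existing
open_softirq()/raise_softirq() interface (the softirq number and the
pt_irq_* function names here are hypothetical, purely for illustration):

    /* Hypothetical: deliver passthrough-device interrupts from a
     * dedicated softirq rather than from a tasklet in the idle vcpu. */
    static void pt_irq_softirq(void)
    {
        /* Drain a (hypothetical) per-cpu queue of pending pirqs and
         * inject them into the guest. */
        pt_irq_process_pending();
    }

    void pt_irq_softirq_init(void)
    {
        open_softirq(PT_IRQ_SOFTIRQ, pt_irq_softirq);
    }

    /* In the hard-irq path, instead of tasklet_schedule(): */
    raise_softirq(PT_IRQ_SOFTIRQ);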

 -- Keir

> 2. Another issue we found during our experiments is very interesting
> (likely to be a bug). In the experiment, we first pinned three guests (two
> CPU-intensive and one IO-intensive) to two logical processors, with each
> guest configured with two virtual CPUs; the CPU utilization share was ~90%
> for each CPU-intensive guest and ~20% for the IO-intensive guest. But the
> magic thing happens after we introduce an additional idle guest, which
> does no real workload and just idles. The CPU utilization share changes to
> ~50% for each CPU-intensive guest and ~100% for the IO-intensive guest.
> After analyzing the scheduling data, we found the change comes from
> virtual timer interrupt delivery to the idle guest. Although the guest is
> idle, there are still 1000 timer interrupts per vcpu per second. The
> current credit scheduler boosts the idle vcpu out of the blocked state and
> triggers 1000 schedule events per second on the target physical processor,
> and the IO-intensive guest apparently benefits from the frequent schedule
> events and gets a larger CPU utilization share. The even more magic thing
> is that after 'xm pause' and 'xm unpause' of the idle guest, each of the
> three guests is allocated a ~66% CPU share.
> This finding tells us some facts:
> (1) The current credit scheduler is not fair to IO-intensive guests.
> (2) IO-intensive guests are able to acquire a fair CPU share when
> competing with CPU-intensive guests.
> (3) The current timeslice (30ms) is meaningless, since the average
> timeslice is far smaller than 1ms under real workloads (this may bring
> performance issues).
> (4) The boost mechanism is too aggressive, and an idle guest shouldn't be
> boosted when it is woken from the halt state (see the sketch after this
> list).
> (5) There is no policy in credit to determine how long a boosted vcpu may
> run, nor how to handle the preempted vcpu.
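> 
> The boost path in question is in the wake handler; roughly (simplified
> from xen/common/sched_credit.c, with a hypothetical guard for fact (4);
> "woke_from_timer_tick" does not exist today):
> 
>     /* csched_vcpu_wake(), simplified: today, any vcpu waking from
>      * block is promoted so that it preempts the running vcpu. */
>     if ( svc->pri == CSCHED_PRI_TS_UNDER )
>         svc->pri = CSCHED_PRI_TS_BOOST;
> 
>     /* A possible mitigation instead: only boost wakeups caused by
>      * real I/O, not periodic timer ticks into a halted guest. */
>     if ( svc->pri == CSCHED_PRI_TS_UNDER && !svc->woke_from_timer_tick )
>         svc->pri = CSCHED_PRI_TS_BOOST;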
> 
> 3. Credit is not really used for the key scheduling decisions. For
> example, when choosing the next task to run, credit is not used to
> evaluate tasks' priorities, and this may be unfair to IO-intensive guests.
> Additionally, a task's priority is not recalculated promptly; it is only
> updated every 30ms. So even if a task's credit has gone negative, its
> priority may still be TS_UNDER or TS_BOOST because of the delayed update.
> Maybe a vcpu's priority should be recomputed from its credit whenever it
> is scheduled out. In addition, when a boosted vCPU is scheduled out, its
> priority is always reset to TS_UNDER, without considering credit. If the
> credit has gone negative, might it be better to set the priority to
> TS_OVER?
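> 
> A sketch of that last suggestion, at the point where a vcpu is
> descheduled (the constants are the csched ones; the placement of this
> check is illustrative, not existing code):
> 
>     /* Illustrative: recompute priority from current credit when the
>      * vcpu is scheduled out, instead of always resetting to UNDER. */
>     if ( atomic_read(&svc->credit) < 0 )
>         svc->pri = CSCHED_PRI_TS_OVER;
>     else
>         svc->pri = CSCHED_PRI_TS_UNDER;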
> 
> Any comments?
> 
> Xiantao
> 
> 
> Jiang, Yunhong wrote:
>> While reading the credit scheduler code and doing experiments, I noticed
>> one interesting thing about the current credit scheduler. For example,
>> consider the following situation:
>> 
>> Hardware:
>> A powerful system with 64 CPUs.
>> 
>> Xen Environment:
>> Dom0 with 8 vCPUs bound to CPUs (0, 16~24)
>> 
>> 3 HVM domains, each with 2 vCPUs, all bound as vcpu0->pcpu1,
>> vcpu1->pcpu2. Among them, 2 are CPU-intensive while 1 is I/O-intensive.
>> 
>> The result shows that the I/O-intensive domain occupies more than
>> 100% CPU, while the two CPU-intensive domains each occupy 50%.
>> 
>> IMHO it should be 66% for each domain.
>> 
>> The reason is how the credit is calculated. Although the 3 HVM domains
>> are pinned to 2 pCPUs and share those 2 CPUs, each will still get 2*300
>> credits at each credit accounting. That means the I/O-intensive HVM
>> domain will never go under credit, so it will preempt the CPU-intensive
>> domains whenever it is boosted (i.e. after an I/O access to QEMU); it is
>> set to TS_UNDER only at tick time, and is then boosted again.
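>> 
>> To put numbers on it (taking 300 credits per pcpu per accounting
>> period, as above):
>> 
>>     credits burnable:    2 pcpus   x 300          =  600 per period
>>     credits handed out:  3 domains x 2 vcpus x 300 = 1800 per period
>> 
>> So each domain is credited as if it owned two whole pcpus, and a domain
>> that burns little CPU (the I/O-intensive one) can never drive its
>> credit negative.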
>> 
>> I'm not sure whether this is a meaningful usage model that needs fixing,
>> but I think it is helpful to show it to the list.
>> 
>> I didn't try credit2, so I have no idea whether this also happens with credit2.
>> 
>> Thanks
>> --jyh



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel