WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
   
 

xen-devel

RE: [Xen-devel] [PATCH] scheduler rate controller

To: George Dunlap <george.dunlap@xxxxxxxxxx>, Dario Faggioli <raistlin@xxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] scheduler rate controller
From: "Lv, Hui" <hui.lv@xxxxxxxxx>
Date: Sat, 29 Oct 2011 10:05:15 +0800
Accept-language: en-US
Cc: "Tian, Kevin" <kevin.tian@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Keir \(Xen.org\)" <keir@xxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, "Duan, Jiangang" <jiangang.duan@xxxxxxxxx>
Delivery-date: Fri, 28 Oct 2011 19:06:20 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1319818714.21033.414.camel@elijah>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <C10D3FB0CD45994C8A51FEC1227CE22F340768D793@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <CAFLBxZZ9nqeb7CVqTZCsEtJRjgGMTHF2Ak929kvauj2KUFSOyg@xxxxxxxxxxxxxx> <C10D3FB0CD45994C8A51FEC1227CE22F3428CB5EF9@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <1319789425.19320.12.camel@Abyss> <C10D3FB0CD45994C8A51FEC1227CE22F3428CB61F2@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <1319796584.19320.31.camel@Abyss> <1319818714.21033.414.camel@elijah>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcyVjEuhgC8vIZe3QEasI9kRnEiSfgASQIQQ
Thread-topic: [Xen-devel] [PATCH] scheduler rate controller
I have tried an approach very similar to your idea:
1) Check whether the currently running vcpu has run for less than 1ms; if so, 
return the current vcpu directly, without preemption.
This tries to guarantee that a vcpu can run for at least 1ms if it wants to.
It reduces the scheduling frequency to some degree, but not very 
significantly, because a 1ms delay is too weak compared with the 10ms delay 
that the SRC patch uses.

As you said, if the several_ms_delay is applied, it takes effect whether the 
system is in a normal state or not (excessive scheduling frequency). The 
possible consequences are that 1) under normal conditions, it will produce 
worse QoS than running without such a delay, and 2) under excessive-frequency 
conditions, the mitigation effect of a 1ms delay may be too weak. In addition, 
your idea delays scheduling instead of reducing it, which means the total 
number of scheduling events would probably not change.
I think one possible solution is to make the value of the 1ms delay adaptive 
according to the system status (low load or high load). If so, the SRC patch 
currently covers just the excessive condition :). That's why I suggested 
treating normal and excessive conditions separately and influencing the normal 
case as little as possible, because we never know the consequences without a 
large amount of testing work. :)
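
To make the adaptive idea above concrete, here is a minimal sketch in C. It is 
not taken from the SRC patch. The function name and both rate thresholds are 
hypothetical, chosen only for illustration; the 1ms and 10ms values are the 
ones mentioned in this thread.

```c
#include <stdint.h>

/* Delay values discussed in the thread. */
#define DELAY_NONE_US       0u  /* normal case: leave scheduling untouched */
#define DELAY_LIGHT_US   1000u  /* the 1ms delay */
#define DELAY_HEAVY_US  10000u  /* the 10ms delay attributed to the SRC patch */

/* Hypothetical thresholds (schedule invocations per second per CPU).
 * Real values would have to come from measurement. */
#define RATE_NORMAL_MAX   2000u
#define RATE_EXCESSIVE   10000u

/* Pick a minimum-run delay from the observed scheduling rate, so that the
 * normal case is not influenced and only higher rates are damped. */
static uint32_t adaptive_delay_us(uint32_t scheds_per_sec)
{
    if (scheds_per_sec <= RATE_NORMAL_MAX)
        return DELAY_NONE_US;
    if (scheds_per_sec < RATE_EXCESSIVE)
        return DELAY_LIGHT_US;
    return DELAY_HEAVY_US;
}
```

How the rate would be sampled (per CPU, over what window) is left open here, 
and that choice probably matters as much as the thresholds themselves.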

Some of my stupid thinking :)

Best regards,

Lv, Hui


-----Original Message-----
From: George Dunlap [mailto:george.dunlap@xxxxxxxxxx] 
Sent: Saturday, October 29, 2011 12:19 AM
To: Dario Faggioli
Cc: Lv, Hui; George Dunlap; Duan, Jiangang; Tian, Kevin; 
xen-devel@xxxxxxxxxxxxxxxxxxx; Keir (Xen.org); Dong, Eddie
Subject: RE: [Xen-devel] [PATCH] scheduler rate controller

On Fri, 2011-10-28 at 11:09 +0100, Dario Faggioli wrote:
> Not sure yet, I can imagine it's tricky and I need to dig a bit more 
> in the code, but I'll let you know if I find a way of doing that...

There are lots of reasons why the SCHEDULE_SOFTIRQ gets raised.  But I think we 
want to focus on the scheduler itself raising it as a result of the .wake() 
callback.  Whether the .wake() happens as a result of a HW interrupt or 
something else, I don't think really matters.

Dario and Hui, neither of you has commented on my idea, which is simply: don't 
preempt a VM if it has run for less than some amount of time (say, 500us or 
1ms).  If a higher-priority VM is woken up, see how long the current VM has 
run.  If it's less than 1ms, set a 1ms timer and call schedule() then.
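
The check described above could look roughly like the following sketch. All 
names are hypothetical and this is not Xen's scheduler interface; it only shows 
the decision taken when a higher-priority VM wakes. One assumption to flag: the 
timer here is armed for the moment the current VM reaches 1ms of runtime, 
although "set a 1ms timer" could also be read as 1ms from the wake-up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimum time a VM may run before it can be preempted (1ms, as suggested
 * above; 500us would be the other value mentioned). */
#define MIN_RUN_NS 1000000ULL

/* Hypothetical result type: either preempt now, or defer to a timer. */
struct preempt_decision {
    bool     preempt_now;  /* true: call schedule() immediately */
    uint64_t timer_at_ns;  /* if deferred: absolute time to call schedule() */
};

/* Called when a higher-priority VM is woken while another VM is running.
 * now_ns is the current time, cur_start_ns is when the running VM was
 * last scheduled in. */
static struct preempt_decision
on_higher_prio_wake(uint64_t now_ns, uint64_t cur_start_ns)
{
    struct preempt_decision d = { true, 0 };
    uint64_t ran_ns = now_ns - cur_start_ns;

    if (ran_ns < MIN_RUN_NS) {
        /* Too early: let the current VM finish its minimum slice. */
        d.preempt_now = false;
        d.timer_at_ns = cur_start_ns + MIN_RUN_NS;
    }
    return d;
}
```

For example, a wake-up 0.5ms after the current VM started running would be 
deferred by another 0.5ms, while a wake-up after 2ms would preempt at once.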

> > > More generally speaking, I see how this feature can be useful, and 
> > > I also think it could live in the generic schedule.c code, but (as 
> > > George was saying) the algorithm by which rate-limiting is 
> > > happening needs to be well known, documented and exposed to the 
> > > user (more than by means of a couple of perf-counters).
> > > 
> > 
> > One question is: what is the right place to document such 
> > information? I'd like to make it as clear as possible to the users.
> > 
> Well, don't know, maybe a WARN (a WARN_ONCE-like thing would probably 
> be better), or in general something that leaves a trace in the logs, 
> so that one can find it by means of `xl dmesg' or similar. Obviously, 
> I'm not suggesting printk-ing each suppressed schedule invocation, 
> or the overhead would get even worse... :-P
> 
> I'm thinking of something that happens the very first time the 
> limiting fires, or maybe once every some period/number of suppressions, 
> just to remind the user that he's getting weird behaviour because 
> _he_enabled_ rate-limiting. Hopefully, that might also be useful for 
> the user itself to fine tune the limiting parameters, although I think 
> the perf-counters are already quite well suited for this.
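
A rough sketch of the "first time, then once every N suppressions" reporting 
described in the quoted text. The struct, the function name and the interval 
of 1000 are all hypothetical; the caller would emit the actual log message 
whenever the function returns true.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reporting interval: one log line per this many suppressions. */
#define REPORT_EVERY 1000u

/* Hypothetical per-CPU (or global) counter of suppressed schedule calls. */
struct suppress_stats {
    uint64_t count;
};

/* Record one suppressed schedule invocation. Returns true when the caller
 * should log: on the very first suppression, and then on every
 * REPORT_EVERY-th one, so the log is not flooded. */
static bool note_suppression(struct suppress_stats *s)
{
    s->count++;
    return s->count == 1 || s->count % REPORT_EVERY == 0;
}
```

With this interval, 2000 suppressions would produce three log lines (at the 
1st, 1000th and 2000th), which keeps the overhead concern above in check.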

As much as possible, we want the system to Just Work.  Under normal 
circumstances it wouldn't be too unusual for a VM to have a several-ms delay 
between receiving a physical interrupt and being scheduled; I think that if the 
1ms delay works, having it on all the time would probably be the best solution. 
 That's another reason I'm in favor of trying it -- it's simple and easy to 
understand, and doesn't require detecting when to "turn it on".

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel