xen-users

[Xen-users] RE: Xen Scheduler: Credit Scheduler ?

To: "Emmanuel Ackaouy" <ack@xxxxxxxxxxxxx>
Subject: [Xen-users] RE: Xen Scheduler: Credit Scheduler ?
From: "Ott, Donna E" <donna.ott@xxxxxx>
Date: Wed, 22 Nov 2006 13:29:04 -0500
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
 
> You need to find out if the VCPUs are blocked in the kernel 
> or runnable but not being scheduled.
> 
> The easiest way to do this is to run 2 spinner processes in 
> the guest after it "stalls".

Well, I did find that if I wait a bit and then hit Ctrl-C and/or type
into the "stalled" domain, it will start up again, but it never gets
much CPU time relative to what it had before. It will then complete
the benchmark, but with errors, obviously.
> 
> That will tell you if it's the application that has stalled 
> or if it's the guest OS that's runnable but not getting any CPU time.

Could you explain the details?
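
A "spinner", as the term is used in the quoted advice, is presumably
just a process that busy-loops and never blocks, so that the guest
always has runnable work. The sketch below is one hedged way to write
such a process in Python; the function names `spin` and `report` are
made up for illustration and do not come from the thread or from Xen.
It counts loop iterations per wall-clock interval. If two of these run
inside the "stalled" guest and their counts stay high, the VCPUs are
being scheduled and the benchmark itself is probably blocked; if the
counts collapse, the guest is runnable but not getting CPU time.

```python
import time


def spin(duration_s=1.0):
    """Busy-loop for duration_s wall-clock seconds.

    Returns the number of loop iterations completed, which is a rough
    proxy for how much CPU time the process actually received.
    """
    deadline = time.monotonic() + duration_s
    count = 0
    while time.monotonic() < deadline:
        count += 1
    return count


def report(samples=5, duration_s=1.0):
    """Run several spin intervals and print the iteration count of each."""
    counts = []
    for _ in range(samples):
        n = spin(duration_s)
        counts.append(n)
        print(f"{n} iterations in {duration_s:.1f}s")
    return counts


if __name__ == "__main__":
    # Start one copy per VCPU (e.g. two copies in a 2-VCPU guest).
    report()
```

The absolute counts mean little on their own; the useful comparison is
between a run taken while the guest behaves normally and a run taken
after the stall.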

> 
> Running 3 competing 2vcpu guests on a 2cpu host may cause 
> some interesting problems because while the OS is written to 
> assume that its physical CPUs all exist at the same time, the 
> same is not necessarily true in a virtual environment.
> Your guest OS or benchmark could be timing out due to
> timeouts on spinlocks or something like that.
I have now run them as 1-CPU guests as well. Once again, I think it
unlikely that my benchmark is timing out: it is well known and well
used, even by me, and has NEVER behaved this way on other OSes or
virtualization software. (That said, anything is possible in
software/hardware land!)

> 
> The way to make progress on this is:
> 1- verify that if your vcpus are runnable they run: do this
>    by running spinners on top of your benchmark or once the
>    benchmark stalls.
Not sure what you mean by this. Once the benchmark stalls, it is still
there, and typing into the domain will make it start to run again,
more or less right where it had "paused".


> 2- verify that the problem goes away with single CPU guests.
It does NOT go away with single-CPU guests. Shockingly, it can even
occur with a single guest and a large load: running, say,
"xm create newguest" will stall out the single guest.

It is particularly easy to see on the first run with three guests, or
even two. Just create them, set up the benchmark, run it in each guest
(by hand), and within moments a "stall" will occur. After the first
run it is harder to trigger, but the first time is fairly repeatable.

It does seem to be less prevalent with single guests, but it can
STILL happen.


> 3- collect scheduler traces on all CPUs.
OK, please explain how to do this. I am running out of time to debug
this. I may soon have to leave it as it is and just go with the
results I have (sadly, as I am so impressed with it when it runs
well).


> 
> In general, the best way to deal with SMP guests which have 
> less CPU resources than their number of VCPUs is to "fold"
> the guest down using the CPU hotplug mechanism. There are 
> other alternatives as well that we can look at. Before we do 
> so, let's try to reduce this problem a bit so we can verify 
> if this is or isn't a virtual SMP issue.
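
For reference, "folding" a Linux guest down with CPU hotplug can be
sketched as follows. This assumes a Linux guest whose kernel was built
with CPU hotplug support and which exposes the standard sysfs control
files under /sys/devices/system/cpu; whether a particular Xen guest
kernel does so is an assumption here, not something the thread
confirms. The helper names `set_cpu_online` and `fold_guest` are
hypothetical, and the script has to run as root inside the guest.

```python
import os

SYSFS_CPU_ROOT = "/sys/devices/system/cpu"


def set_cpu_online(cpu, online, sysfs_root=SYSFS_CPU_ROOT):
    """Write "1" (online) or "0" (offline) to a CPU's hotplug control file."""
    if cpu == 0 and not online:
        # Assumption: the boot CPU often cannot be taken offline.
        raise ValueError("refusing to offline CPU 0")
    path = os.path.join(sysfs_root, f"cpu{cpu}", "online")
    with open(path, "w") as f:
        f.write("1" if online else "0")
    return path


def fold_guest(keep, total, sysfs_root=SYSFS_CPU_ROOT):
    """Leave CPUs 0..keep-1 online and take CPUs keep..total-1 offline."""
    if keep < 1:
        raise ValueError("at least one CPU must stay online")
    for cpu in range(keep, total):
        set_cpu_online(cpu, False, sysfs_root)


# Example (as root, inside a 2-VCPU guest): fold_guest(keep=1, total=2)
```

The `sysfs_root` parameter exists only so the logic can be tried
against a scratch directory before it is pointed at the real sysfs.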

Sounds great to me. I hope this latest data is helpful. I wish I had
more time!
Cheers
Donna "thankful for what I found that worked well" Ott
