Re: [Xen-devel] Scheduler problem in XEN 3.4.0


  • To: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
  • From: Pankaj Parakh <me.pankajparakh@xxxxxxxxx>
  • Date: Wed, 28 Oct 2009 17:31:14 +0530
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 28 Oct 2009 05:03:02 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi George,

First of all, thank you: I am now able to run the rr scheduler. The
missing wake function was the only thing stopping it from running,
though I still need to improve it. Your review was very helpful. I
also now have a two-machine serial console setup, which lets me see
printk output from the Xen code.
Next I want to change a policy parameter, say the timeslice quantum,
at runtime. I have not been able to find much reference material for
this. I know that I have to add an entry in domctl.h, but what comes
next?
Please guide me through it.

Thanks a lot again.

Regards,
Pankaj Parakh.
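
For reference, the runtime-adjustable quantum asked about above could be
modelled roughly as follows. This is a standalone sketch, not Xen code:
rr_cmd, rr_params, rr_state, rr_adjust and the min/max bounds are all
hypothetical names and values chosen for illustration. In Xen itself the
request would presumably arrive through the scheduler's adjust hook via a
per-scheduler entry in domctl.h, which is an assumption to check against
the credit scheduler's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command codes, modelled on a get/put pair. */
enum rr_cmd { RR_CMD_GETINFO, RR_CMD_PUTINFO };

/* Hypothetical parameter block that a management tool would fill in. */
struct rr_params {
    uint32_t quantum_ms;
};

/* Assumed sanity bounds; the real limits are a policy decision. */
#define RR_QUANTUM_MIN_MS 1u
#define RR_QUANTUM_MAX_MS 1000u

/* Scheduler-wide state; a do_schedule() would read quantum_ms each time. */
struct rr_params rr_state = { .quantum_ms = 30 };

/* Read or update the quantum; returns 0 on success, -EINVAL otherwise. */
int rr_adjust(enum rr_cmd cmd, struct rr_params *p)
{
    if (p == NULL)
        return -EINVAL;

    switch (cmd) {
    case RR_CMD_GETINFO:
        *p = rr_state;              /* report the current setting */
        return 0;
    case RR_CMD_PUTINFO:
        if (p->quantum_ms < RR_QUANTUM_MIN_MS ||
            p->quantum_ms > RR_QUANTUM_MAX_MS)
            return -EINVAL;         /* reject out-of-range values */
        rr_state = *p;              /* takes effect on the next decision */
        return 0;
    }
    return -EINVAL;                 /* unknown command */
}
```

Validating the value before storing it means a bad request from the
toolstack cannot leave the scheduler with a zero-length timeslice.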


On Mon, Oct 26, 2009 at 8:56 PM, George Dunlap
<george.dunlap@xxxxxxxxxxxxx> wrote:
>
> Pankaj Parakh wrote:
>>
>> Hi George,
>>
>> Thanks for showing interest. First of all I'll try to set up a
>> debugging environment for Xen. I also checked Ananth's mailing list
>> thread; I know him personally, and he left the work somewhere in the
>> middle.
>>
>> I used printk's and a locking method to find out which functions in
>> schedule.c are being called, and found the following sequence:
>>
>> schedule_init()
>> schedule_domain_init()
>> schedule_vcpu_init()
>> schedule_domain_init()
>> schedule_vcpu_init()
>> (after this, no function in schedule.c is called and the system hangs)
>>
>> I can't say what the problem is. I am only changing the sched_priv
>> data of the vcpu struct, and my rr scheduler has only three functions,
>> viz. init_vcpu, destroy_vcpu and do_schedule.
>>
>> I have attached the code as well.
>>
>> Following are my doubts:
>>
>
> [Language point: In many English dialects, such as US and UK, "doubt" implies 
> something negative.  To avoid being misunderstood by speakers of those 
> dialects, "questions" might be a better word to use.]
>>
>> 1. Is the above function sequence right? Why is init_domain called
>> twice during boot?
>>
>
> I suspect that the idle domain (32767) and domain 0 are the two domains being 
> initialized.  You could easily find out by adding the domain id to the printk.
>>
>> 2. Does a scheduler policy need to manipulate other parts of the vcpu
>> struct (other than sched_priv)?
>>
>
> I don't believe so.  You could look through other schedulers (like the credit 
> scheduler) to see if this is so.
>>
>> 3. Is it necessary to maintain domain information inside the scheduler policy?
>>
>
> I don't know.
>
> A quick scan through the attached file turns up a couple of points:
> + It appears that the algorithm adds *all* vcpus to a single global runqueue, 
> and scans through them looking for "runnable" vcpus on schedule.  This seems 
> pretty pointless: why not add them to the global runqueue on wake?
> + Furthermore, I'm not sure your linked-list implementation is sound; for 
> example, the initial v->sched_priv is not initialized to NULL. (Not going to 
> spend a lot of time trying to figure out if that's OK or not.)
> + There is some missing logic regarding the v->processor field and 
> sync_vcpu_execstate().  Xen is designed expecting a per-cpu runqueue with 
> explicit migrations of vcpus between pcpus.  One of the reasons for this is 
> so that when switching between a vcpu and the idle domain, it doesn't 
> actually need to do a full context switch.  As a result, if you run a vcpu on 
> one cpu, and then run it on another without an explicit sync, you may have 
> stale data in the vcpu struct (and Xen will throw a BUG).
> + You don't implement a .wake() method.  I think that Xen will wake() dom0 
> when it's ready, expecting the .wake() method to raise a SCHEDULE softirq on 
> the appropriate pcpu (from which the .do_schedule() method is called).  So no 
> wake method means no schedule(), and no schedule means it just runs the idle 
> domain.
>
> It might be best to start with just one pcpu, and add multiple cpus once 
> you get things running.  Try adding a wake() method that will check to see if 
> cpu 0 is idle; if it is, raise SCHEDULE_SOFTIRQ on that cpu, and see what you 
> get.  After that, try adding some logic to figure out which other cpu to wake 
> instead; but be advised that you'll probably hit the BUG() relating to 
> sync_vcpu_execstate() not being properly called.
>
> -George
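
The single-pcpu suggestion above (queue a vcpu when it wakes, and kick the
cpu only if it is idle) can be sketched as a small standalone model. This
is not Xen code: vcpu_model, cpu_model, rr_wake and rr_do_schedule are
hypothetical names, and the softirq_pending flag merely stands in for
raising SCHEDULE_SOFTIRQ on the target pcpu. A NULL return from
rr_do_schedule stands in for picking the idle vcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 1                     /* start with a single pcpu */

struct vcpu_model {
    int id;
    bool queued;                      /* already on a runqueue? */
    struct vcpu_model *next;
};

struct cpu_model {
    struct vcpu_model *head, *tail;   /* FIFO runqueue of runnable vcpus */
    bool running_idle;                /* is the idle vcpu running here? */
    bool softirq_pending;             /* stand-in for SCHEDULE_SOFTIRQ */
};

struct cpu_model cpus[NR_CPUS] = { { NULL, NULL, true, false } };

/* Wake: enqueue the vcpu once, then kick the cpu if it is idling. */
void rr_wake(struct vcpu_model *v, int cpu)
{
    struct cpu_model *c = &cpus[cpu];

    if (v->queued)
        return;                       /* a duplicate wake is a no-op */

    v->next = NULL;                   /* never trust stale list pointers */
    if (c->tail != NULL)
        c->tail->next = v;
    else
        c->head = v;
    c->tail = v;
    v->queued = true;

    if (c->running_idle)
        c->softirq_pending = true;    /* ask the cpu to reschedule */
}

/* Schedule: pop the head of the queue; NULL means "run the idle vcpu". */
struct vcpu_model *rr_do_schedule(int cpu)
{
    struct cpu_model *c = &cpus[cpu];
    struct vcpu_model *v = c->head;

    c->softirq_pending = false;       /* this call services the request */
    if (v == NULL) {
        c->running_idle = true;
        return NULL;
    }

    c->head = v->next;
    if (c->head == NULL)
        c->tail = NULL;
    v->next = NULL;
    v->queued = false;
    c->running_idle = false;
    return v;
}
```

Because only woken vcpus are ever queued, the schedule path does not need
to scan for runnable entries, and without a wake the model never leaves
the idle vcpu, which mirrors the hang described earlier in the thread.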
>>
>> On Tue, Oct 13, 2009 at 5:17 PM, George Dunlap
>> <George.Dunlap@xxxxxxxxxxxxx> wrote:
>>
>>>
>>> Pankaj,
>>>
>>> I haven't used the round-robin scheduler code in that book, but
>>> another guy named Ananth tried to use it unsuccessfully as well.  You
>>> can see some of that thread here (not sure why I can't find the
>>> original post):
>>>  http://lists.xensource.com/archives/html/xen-devel/2009-05/msg00004.html
>>>
>>> Most of us are not so interested in finding the bug in the book's
>>> code, but we are interested in helping *you* find it.  If you continue
>>> to do hypervisor work (and especially if you do anything with the
>>> scheduler), you're going to have to learn how to debug a hypervisor,
>>> which is often rather a pain in the neck.
>>>
>>> There is some advice in the thread linked to above about setting up a
>>> serial console.  You can add printk()'s around to find out what the
>>> sched_rr code is doing and where it's going wrong, and ask more
>>> questions on the list if you get stuck.  (Feel free to cc me to bring
>>> it to my attention, but always send it to the whole list.)
>>>
>>> Good luck,
>>>  -George
>>>
>>>
>>> On Mon, Oct 12, 2009 at 6:15 PM, Pankaj Parakh
>>> <me.pankajparakh@xxxxxxxxx> wrote:
>>>
>>>>
>>>> Hi All,
>>>>
>>>> I am trying to learn about the schedulers in XEN, so for a start I am
>>>> using XEN 3.4.0 and the book The Definitive Guide to the Xen
>>>> Hypervisor by David Chisnall. I have followed its steps in Chap. 12
>>>> to add a scheduler.
>>>> I don't know what the problem is, but I am unable to boot with that
>>>> scheduler selected. I have been trying to debug this, but I am
>>>> stuck.
>>>> The scheduler given in the book is a trivial round robin scheduler.
>>>>
>>>> I also don't know whether the problem is with the code or with the
>>>> procedure.
>>>>
>>>> Please help me out with this.
>>>>
>>>> Thanks
>>>> Pankaj Parakh
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>>> http://lists.xensource.com/xen-devel
>>>>
>>>>
>>
>>
>>
>> --
>> Pankaj Parakh
>>
>



--
Pankaj Parakh
