|
|
|
|
|
|
|
|
|
|
xen-devel
Re: [Xen-devel] deadlock in the credit2
On 14/10/2011 12:47, "George Dunlap" <George.Dunlap@xxxxxxxxxxxxx> wrote:
> 2011/10/14 Eunbyung Park <silverbottlep@xxxxxxxxx>:
>> IMHO, it seems to be deadlock when changing dom0's weight in credit2
>> scheduler.
>>
>> when the sched_adjust() in schedule.c is called, it grabs the
>> schedule_lock after pausing all of the vcpus
>>
>> and then, csched_dom_cntl in sched_credit2.c, it also grab the
>> schedule_lock by using vcpu_schedule_lock_irq().
>>
>> In the credit2, all of the percpu schedule_lock points out same runqueue
>> lock if they belong to same runqueue.
>>
>> Eventually, all of vcpu are paused except for itself running the code,
>> and it try to grab schedule_lock that was grabbed by itself.
>>
>> Am I right? If I was wrong, please tell me my misunderstanding.
>
> Hmm, I think you may have discovered the source of a bug that people
> have been reporting but I haven't had time to look into yet.
>
> Keir, I think that lock in schedule.c around SCHED(adjust) must be
> wrong. By the time we grab that lock, grabbing it will be completely
> pointless. What are we going to be racing against? In any case, the
> actual scheduler should be responsible for grabbing locks; there's no
> reason that the scheduler can't grab whatever lock it needs inside
> that function. I haven't done a deep analysis, but my initial
> instinct is to just get rid of it. What do you think?
Fine by me. The synchronisation in that function looks pretty fragile. It's
probably outdated too.
-- Keir
>> if ( d == current->domain )
>> vcpu_schedule_lock_irq(current);
>>
>> It was very hard to understan for me..:) What does it exactly mean?
>
> You're asking what "current" means? "current" is a macro that always
> resolves to the vcpu which is running on the current processor.
>
> sched_adjust() seems to be trying to avoid scheduling races in general
> by pausing all vcpus before calling the per-scheduler function. But
> if a VM is calling the op on itself, the vcpu making the hypercall
> can't pause itself. So in that case (current->domain == d) will be
> true, so sched_adjust() grab the schedule lock of that vm instead.
>
> But really all that locking should be handled in the scheduler
> function, not by the generic code. It knows best what needs to be
> locked when.
>
> -George
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|
|
|
|
|