[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] deadlock in the credit2



2011/10/14 Eunbyung Park <silverbottlep@xxxxxxxxx>:
> IMHO, it seems to be deadlock when changing dom0's weight in credit2
> scheduler.
>
> when the sched_adjust() in schedule.c is called, it grabs the
> schedule_lock after pausing all of the vcpus
>
> and then, csched_dom_cntl in sched_credit2.c, it also grab the
> schedule_lock by using vcpu_schedule_lock_irq().
>
> In the credit2, all of the percpu schedule_lock points out same runqueue
> lock if they belong to same runqueue.
>
> Eventually, all of vcpu are paused except for itself running the code,
> and it try to grab schedule_lock that was grabbed by itself.
>
> Am I right? If I was wrong, please tell me my misunderstanding.

Hmm, I think you may have discovered the source of a bug that people
have been reporting but I haven't had time to look into yet.

Keir, I think that lock in schedule.c around SCHED(adjust) must be
wrong.  By the time we grab that lock, grabbing it will be completely
pointless.  What are we going to be racing against?  In any case, the
actual scheduler should be responsible for grabbing locks; there's no
reason that the scheduler can't grab whatever lock it needs inside
that function.  I haven't done a deep analysis, but my initial
instinct is to just get rid of it.  What do you think?

> if ( d == current->domain )
> vcpu_schedule_lock_irq(current);
>
> It was very hard to understan for me..:) What does it exactly mean?

You're asking what "current" means?  "current" is a macro that always
resolves to the vcpu which is running on the current processor.

sched_adjust() seems to be trying to avoid scheduling races in general
by pausing all vcpus before calling the per-scheduler function.  But
if a VM is calling the op on itself, the vcpu making the hypercall
can't pause itself.  So in that case (current->domain == d) will be
true, so sched_adjust() grab the schedule lock of that vm instead.

But really all that locking should be handled in the scheduler
function, not by the generic code.  It knows best what needs to be
locked when.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.