xen-devel

Re: [Xen-devel] Re: [PATCH] [RFC] Credit2 scheduler prototype

To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] Re: [PATCH] [RFC] Credit2 scheduler prototype
From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Date: Wed, 13 Jan 2010 16:05:05 +0000
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <C7739463.648C%keir.fraser@xxxxxxxxxxxxx>
References: <de76405a1001130648u50ccf3ebg3bde1b0c79840366@xxxxxxxxxxxxxx> <C7739463.648C%keir.fraser@xxxxxxxxxxxxx>
On Wed, Jan 13, 2010 at 3:16 PM, Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:
> On 13/01/2010 14:48, "George Dunlap" <George.Dunlap@xxxxxxxxxxxxx> wrote:
>
>> The first implements something like what you suggest below, but
>> instead of using a sort of "hack" with VPF_migrate, it makes a proper
>> "context_saved" SCHED_OP callback.
>
> I thought using the vcpu_migrate() path might work well since you presumably
> have logic there to pick a new cpu which is relatively unloaded, making the
> cpu which tried to schedule the vcpu but had to idle instead a prime
> candidate. So rather than having to implement a new callback hook, you'd get
> to leverage the pick_cpu hook for free?

Hmm, not sure that actually gives us the leverage we need to solve all
the races.  If you look at sched_credit2.c (in the credit2-hypervisor
patch), you'll see I added two flags to the private vcpu struct: one
to indicate that the vcpu has (or may have) context somewhere on a
cpu, and thus can't be added to the runqueue; another to indicate that
when the first flag is cleared, it should be added to the runqueue.
In the current implementation, the first flag is set and cleared every
time a vcpu is scheduled or descheduled, whether it needs to be added
to the runqueue after context_saved() or not.
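
To make the two-flag idea concrete, here is a minimal single-threaded sketch in C. It is not the code from the credit2-hypervisor patch: the struct, the field names (context_live, delayed_runq_add, on_runq) and the sketch_* helpers are all hypothetical, and the locking is left out entirely. It only shows the intended state transitions: a wake that arrives while the vcpu's context may still be live is deferred, and context_saved() completes it.

```c
#include <stdbool.h>

/* Hypothetical per-vcpu scheduler state; names are illustrative only. */
struct sketch_vcpu {
    bool context_live;      /* flag 1: context is (or may be) on some cpu */
    bool delayed_runq_add;  /* flag 2: enqueue once flag 1 is cleared */
    bool on_runq;           /* stands in for runqueue membership */
};

/* The scheduler picks the vcpu: it leaves the runqueue and flag 1 is set. */
static void sketch_schedule_in(struct sketch_vcpu *v)
{
    v->on_runq = false;
    v->context_live = true;
}

/* Wake path: must not enqueue while the context may still be live. */
static void sketch_wake(struct sketch_vcpu *v)
{
    if (v->on_runq)
        return;                      /* already queued, nothing to do */
    if (v->context_live) {
        v->delayed_runq_add = true;  /* defer to context_saved() */
        return;
    }
    v->on_runq = true;
}

/* Called once the old cpu has finished saving the vcpu's context. */
static void sketch_context_saved(struct sketch_vcpu *v)
{
    v->context_live = false;
    if (v->delayed_runq_add) {
        v->delayed_runq_add = false;
        v->on_runq = true;           /* perform the deferred enqueue */
    }
}
```

In this sketch, calling sketch_schedule_in(), then sketch_wake(), then sketch_context_saved() leaves the vcpu on the runqueue only after the last call. In the real scheduler all three paths would presumably run under the runqueue lock.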

[NB: the current global lock will eventually be replaced with
per-runqueue locks.]

In particular, one of the races without the first flag looks like this
(brackets indicate physical cpu):
[0] lock cpu0 schedule lock
[0] lock credit2 runqueue lock
[0] take vX off runqueue; vX->processor == 1
[0] unlock credit2 runqueue lock
[1] vcpu_wake(vX): lock cpu1 schedule lock
[1] finds vX->running false, adds it to the runqueue
[1] unlock cpu1 schedule lock
[0] vX->running = 1
[0] unlock cpu0 schedule lock
[0] lock cpu1 schedule lock (vX->processor == 1)
[0] vX->processor = 0
[0] unlock cpu1 schedule lock
[1] takes vX from the runqueue, finds vX->running is true *ERROR*

I guess the real problem here is that vX->running is set even though
the vX->processor schedule lock isn't held, causing a race with
vcpu_wake().  In the other schedulers this can't happen, since it
takes an explicit migrate to change processors.  In the attached
patches, csched2 operations serialize on the runqueue lock, fixing
that particular race.
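
The interleaving above can be replayed deterministically in a small C model. This is only a sketch under assumptions: struct race_vcpu, its fields and replay_trace() are hypothetical names, the locks are not modelled, and the steps simply run in the order listed in the trace. With use_flags false it ends in the *ERROR* state (vX is both on the runqueue and running); with use_flags true, and assuming the first flag is set in the same critical section that takes vX off the runqueue, the wake is deferred and the error state is not reached.

```c
#include <stdbool.h>

/* Illustrative model only; field names are hypothetical. */
struct race_vcpu {
    int  processor;     /* cpu whose schedule lock nominally covers vX */
    bool running;
    bool on_runq;
    bool context_live;  /* "first flag": context may be on a cpu */
    bool delayed_add;   /* "second flag": enqueue after context_saved */
};

/*
 * Replay the trace step by step. Returns true if cpu1 would end up
 * taking a vcpu from the runqueue that is already marked running.
 */
static bool replay_trace(bool use_flags)
{
    struct race_vcpu vx = { 1, false, true, false, false };

    /* [0] holding the runqueue lock: take vX off the runqueue */
    vx.on_runq = false;
    if (use_flags)
        vx.context_live = true;  /* assumed: set in the same section */

    /* [1] vcpu_wake(vX) under cpu1's schedule lock */
    if (use_flags && vx.context_live)
        vx.delayed_add = true;   /* defer instead of enqueueing */
    else if (!vx.running)
        vx.on_runq = true;       /* the racy enqueue from the trace */

    /* [0] marks vX running, then moves it to cpu0 */
    vx.running = true;
    vx.processor = 0;

    /* [1] picks from the runqueue: error if vX is there and running */
    return vx.on_runq && vx.running;
}
```

The model says nothing about whether the deferred wake is later completed correctly; it only illustrates why checking vX->running alone in the wake path is not enough in this ordering.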

Can't think of a better solution off the top of my head; I'll give it
some thought.

 -George
