
Re: [Xen-devel] [PATCH v1 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler





On Mon, May 11, 2015 at 1:57 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>> On 11.05.15 at 00:04, <lichong659@xxxxxxxxx> wrote:
> On Fri, May 8, 2015 at 2:49 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>> >>> On 07.05.15 at 19:05, <lichong659@xxxxxxxxx> wrote:
>> > @@ -1110,6 +1113,67 @@ rt_dom_cntl(
>> >          }
>> >          spin_unlock_irqrestore(&prv->lock, flags);
>> >          break;
>> > +    case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
>> > +        op->u.rtds.nr_vcpus = 0;
>> > +        spin_lock_irqsave(&prv->lock, flags);
>> > +        list_for_each( iter, &sdom->vcpu )
>> > +            vcpu_index++;
>> > +        spin_unlock_irqrestore(&prv->lock, flags);
>> > +        op->u.rtds.nr_vcpus = vcpu_index;
>>
>> Does dropping of the lock here and re-acquiring it below really work
>> race free?
>>
>
> Here, the lock is used in the same way as the ones in the two cases
> above (XEN_DOMCTL_SCHEDOP_get/putinfo). So I think if race free
> is guaranteed in that two cases, the lock in this case works race free
> as well.

No - the difference is that in the {get,put}info cases it is being
acquired just once each.

I see. I changed it based on Dario's suggestions.
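Roughly, the single-acquisition version I have in mind looks like this (a sketch only; it assumes local_sched has already been sized before taking the lock, e.g. from d->max_vcpus, since no allocation can happen under the spin lock):

    case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
        /* Count and fill the entries under one lock acquisition, so the
         * reported count and the entries come from the same snapshot. */
        spin_lock_irqsave(&prv->lock, flags);
        vcpu_index = 0;
        list_for_each( iter, &sdom->vcpu )
        {
            struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);

            local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
            local_sched[vcpu_index].period = svc->period / MICROSECS(1);
            local_sched[vcpu_index].index = vcpu_index;
            vcpu_index++;
        }
        spin_unlock_irqrestore(&prv->lock, flags);
        op->u.rtds.nr_vcpus = vcpu_index;
        break;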

>> > +        vcpu_index = 0;
>> > +        spin_lock_irqsave(&prv->lock, flags);
>> > +        list_for_each( iter, &sdom->vcpu )
>> > +        {
>> > +            struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
>> > +
>> > +            local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
>> > +            local_sched[vcpu_index].period = svc->period / MICROSECS(1);
>> > +            local_sched[vcpu_index].index = vcpu_index;
>>
>> What use is this index to the caller? I think you rather want to tell it
>> the vCPU number. That's especially also taking the use case of a
>> get/set pair into account - unless you tell me that these indexes can
>> never change, the indexes passed back into the set operation would
>> risk to have become stale by the time the hypervisor processes the
>> request.
>>
>
> I don't quite understand what the "stale" means. The array here
> (local_sched[ ]) and the array (in libxc) that local_sched[ ] is copied
> to are both used for this get operation only. When users set per-vcpu
> parameters, there are also dedicated arrays for that set operation.

Just clarify this for me (and maybe yourself): Is the vCPU number
<-> vcpu_index mapping invariable for the lifetime of a domain?
If it isn't, the vCPU for a particular vcpu_index during a "get"
may be different from that for the same vcpu_index during a
subsequent "set".

Here, vcpu_index is meant to be the vcpu_id. I'll use svc->vcpu.vcpu_id instead of
vcpu_index in the next version.
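That is, each entry reported by the "get" would carry the vcpu's stable ID rather than its list position, roughly like the following (a sketch only, keeping the current interface field name and assuming svc->vcpu is the usual back-pointer to the struct vcpu):

            /* Report the vcpu's stable ID instead of its position in
             * sdom->vcpu, so a later "set" cannot act on a stale index. */
            local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
            local_sched[vcpu_index].period = svc->period / MICROSECS(1);
            local_sched[vcpu_index].index  = svc->vcpu->vcpu_id;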

>> > +        if( local_sched == NULL )
>> > +        {
>> > +            return -ENOMEM;
>> > +        }
>> > +        copy_from_guest(local_sched, op->u.rtds.vcpus, op->u.rtds.nr_vcpus);
>> > +
>> > +        for( i = 0; i < op->u.rtds.nr_vcpus; i++ )
>> > +        {
>> > +            vcpu_index = 0;
>> > +            spin_lock_irqsave(&prv->lock, flags);
>> > +            list_for_each( iter, &sdom->vcpu )
>> > +            {
>> > +                struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
>> > +                if ( local_sched[i].index == vcpu_index )
>> > +                {
>> > +                    if ( local_sched[i].period <= 0 ||
>> > +                         local_sched[i].budget <= 0 )
>> > +                        return -EINVAL;
>> > +
>> > +                    svc->period = MICROSECS(local_sched[i].period);
>> > +                    svc->budget = MICROSECS(local_sched[i].budget);
>> > +                    break;
>> > +                }
>> > +                vcpu_index++;
>> > +            }
>> > +            spin_unlock_irqrestore(&prv->lock, flags);
>> > +        }
>>
>> Considering a maximum size guest, these two nested loops could
>> require a couple of million iterations. That's too much without any
>> preemption checks in the middle.
>>
>
> The section protected by the lock is only the "list_for_each" loop, whose
> running time is limited by the number of vcpus of a domain (32 at most).

Since when is 32 the limit on the number of vCPU-s in a domain?

Based on Dario's suggestion, I'll use vcpu_id to locate the vcpu to set, which costs
much less time.
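Something along these lines (again just a sketch; "vcpuid" stands for whatever the interface field carrying the vcpu ID ends up being called, and the copy_from_guest error handling is elided):

        for ( i = 0; i < op->u.rtds.nr_vcpus; i++ )
        {
            struct vcpu *v;
            struct rt_vcpu *svc;

            /* Look the vcpu up directly by its ID instead of walking
             * sdom->vcpu for every entry. */
            if ( local_sched[i].vcpuid >= d->max_vcpus ||
                 (v = d->vcpu[local_sched[i].vcpuid]) == NULL )
                return -EINVAL;

            if ( local_sched[i].period <= 0 || local_sched[i].budget <= 0 )
                return -EINVAL;

            svc = rt_vcpu(v);
            spin_lock_irqsave(&prv->lock, flags);
            svc->period = MICROSECS(local_sched[i].period);
            svc->budget = MICROSECS(local_sched[i].budget);
            spin_unlock_irqrestore(&prv->lock, flags);
        }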

> If this does cause problems, I think adding a "hypercall_preempt_check()"
> at the outside "for" loop may help. Is that right?

Yes.
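Roughly, that check would sit at the top of the outer loop, something like the sketch below (assuming -ERESTART propagates up and the domctl layer creates the continuation; how the already-processed entries are skipped on resumption is still to be worked out):

        for ( i = 0; i < op->u.rtds.nr_vcpus; i++ )
        {
            /* Give other work a chance to run between entries.  Progress
             * would still have to be recorded somewhere so the continuation
             * can resume with the not-yet-processed entries. */
            if ( i > 0 && hypercall_preempt_check() )
                return -ERESTART;

            /* ... look up the vcpu and update its period/budget ... */
        }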

>> > --- a/xen/common/schedule.c
>> > +++ b/xen/common/schedule.c
>> > @@ -1093,7 +1093,9 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
>> >
>> >      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
>> >           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
>> > -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
>> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
>> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_putvcpuinfo) &&
>> > +          (op->cmd != XEN_DOMCTL_SCHEDOP_getvcpuinfo)) )
>>
>> Imo this should become a switch now.
>>
>
> Do you mean "switch ( op->cmd )" ? I'm afraid that would make it look more
> complicated.

This may be a matter of taste to a certain degree, but I personally
don't think a series of four almost identical comparisons reads any
better than its switch() replacement. But it being a style issue, the
ultimate decision is with George as the maintainer anyway.
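For reference, the switch() form under discussion would read roughly (sketch only):

    if ( op->sched_id != DOM2OP(d)->sched_id )
        return -EINVAL;

    switch ( op->cmd )
    {
    case XEN_DOMCTL_SCHEDOP_putinfo:
    case XEN_DOMCTL_SCHEDOP_getinfo:
    case XEN_DOMCTL_SCHEDOP_putvcpuinfo:
    case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
        break;
    default:
        return -EINVAL;
    }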

Jan



--
Chong Li
Department of Computer Science and Engineering
Washington University in St. Louis
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

