
Re: [Xen-devel] [PATCH v1 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler





On Fri, May 8, 2015 at 2:49 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>> On 07.05.15 at 19:05, <lichong659@xxxxxxxxx> wrote:
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -1085,6 +1085,9 @@ rt_dom_cntl(
>      struct list_head *iter;
>      unsigned long flags;
>      int rc = 0;
> +    xen_domctl_sched_rtds_params_t *local_sched;
> +    int vcpu_index=0;
> +    int i;

unsigned int

> @@ -1110,6 +1113,67 @@ rt_dom_cntl(
>          }
>          spin_unlock_irqrestore(&prv->lock, flags);
>          break;
> +    case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
> +        op->u.rtds.nr_vcpus = 0;
> +        spin_lock_irqsave(&prv->lock, flags);
> +        list_for_each( iter, &sdom->vcpu )
> +            vcpu_index++;
> +        spin_unlock_irqrestore(&prv->lock, flags);
> +        op->u.rtds.nr_vcpus = vcpu_index;

Does dropping of the lock here and re-acquiring it below really work
race free?

Here, the lock is used in the same way as in the two cases above
(XEN_DOMCTL_SCHEDOP_get/putinfo). So I think that if those two cases are
race free, the lock in this case works race free as well.
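
That said, if the vcpu list could in principle change while the lock is dropped
for the allocation, a defensive version of the second loop (rough, untested
sketch only, reusing the variables from the patch above) could bound the copy
by the number of elements actually allocated:

    i = 0;
    spin_lock_irqsave(&prv->lock, flags);
    list_for_each( iter, &sdom->vcpu )
    {
        struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);

        /* Stop if the list grew after the count was taken above. */
        if ( i >= op->u.rtds.nr_vcpus )
            break;
        local_sched[i].budget = svc->budget / MICROSECS(1);
        local_sched[i].period = svc->period / MICROSECS(1);
        local_sched[i].index = i;
        i++;
    }
    spin_unlock_irqrestore(&prv->lock, flags);
    copy_to_guest(op->u.rtds.vcpus, local_sched, i);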


> +        local_sched = xzalloc_array(xen_domctl_sched_rtds_params_t,
> +                vcpu_index);
> +        if( local_sched == NULL )
> +        {
> +            return -ENOMEM;
> +        }

Pointless braces.

> +        vcpu_index = 0;
> +        spin_lock_irqsave(&prv->lock, flags);
> +        list_for_each( iter, &sdom->vcpu )
> +        {
> +            struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +
> +            local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
> +            local_sched[vcpu_index].period = svc->period / MICROSECS(1);
> +            local_sched[vcpu_index].index = vcpu_index;

What use is this index to the caller? I think you rather want to tell it
the vCPU number. That's especially also taking the use case of a
get/set pair into account - unless you tell me that these indexes can
never change, the indexes passed back into the set operation would
risk to have become stale by the time the hypervisor processes the
request.

I don't quite understand what "stale" means here. The array here (local_sched[])
and the array (in libxc) that local_sched[] is copied to are both used only for
this get operation. When users set per-vcpu parameters, dedicated arrays are
used for that set operation as well.
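
If reporting the actual vCPU number is what is wanted, I think the loop could
record the id held in the vcpu struct instead of the list position (sketch,
assuming struct rt_vcpu keeps its vcpu back-pointer):

    list_for_each( iter, &sdom->vcpu )
    {
        struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);

        local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
        local_sched[vcpu_index].period = svc->period / MICROSECS(1);
        /* Report the stable vCPU number rather than the list position. */
        local_sched[vcpu_index].index = svc->vcpu->vcpu_id;
        vcpu_index++;
    }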

> +            vcpu_index++;
> +        }
> +        spin_unlock_irqrestore(&prv->lock, flags);
> +        copy_to_guest(op->u.rtds.vcpus, local_sched, vcpu_index);
> +        xfree(local_sched);
> +        rc = 0;
> +        break;
> +    case XEN_DOMCTL_SCHEDOP_putvcpuinfo:
> +        local_sched = xzalloc_array(xen_domctl_sched_rtds_params_t,
> +                op->u.rtds.nr_vcpus);

While above using xzalloc_array() is warranted for security reasons,
I don't see why you wouldn't be able to use xmalloc_array() here.
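
If the point is that every element is fully overwritten by the copy_from_guest()
below, then I take it something like this (sketch) would be sufficient here:

    local_sched = xmalloc_array(xen_domctl_sched_rtds_params_t,
                                op->u.rtds.nr_vcpus);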

> +        if( local_sched == NULL )
> +        {
> +            return -ENOMEM;
> +        }
> +        copy_from_guest(local_sched, op->u.rtds.vcpus, op->u.rtds.nr_vcpus);
> +
> +        for( i = 0; i < op->u.rtds.nr_vcpus; i++ )
> +        {
> +            vcpu_index = 0;
> +            spin_lock_irqsave(&prv->lock, flags);
> +            list_for_each( iter, &sdom->vcpu )
> +            {
> +                struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +                if ( local_sched[i].index == vcpu_index )
> +                {
> +                    if ( local_sched[i].period <= 0 || local_sched[i].budget <= 0 )
> +                        return -EINVAL;
> +
> +                    svc->period = MICROSECS(local_sched[i].period);
> +                    svc->budget = MICROSECS(local_sched[i].budget);
> +                    break;
> +                }
> +                vcpu_index++;
> +            }
> +            spin_unlock_irqrestore(&prv->lock, flags);
> +        }

Considering a maximum size guest, these two nested loops could
require a couple of million iterations. That's too much without any
preemption checks in the middle.

The section protected by the lock is only the "list_for_each" loop, whose
running time is limited by the number of vcpus of a domain (32 at most).
If this does cause problems, I think adding a "hypercall_preempt_check()"
in the outer "for" loop may help. Is that right?
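
Roughly like this (sketch only; I still need to check how -ERESTART is handled
on this particular domctl path):

    for ( i = 0; i < op->u.rtds.nr_vcpus; i++ )
    {
        if ( hypercall_preempt_check() )
        {
            /* Give up the CPU and let the hypercall be restarted. */
            rc = -ERESTART;
            break;
        }
        spin_lock_irqsave(&prv->lock, flags);
        list_for_each( iter, &sdom->vcpu )
        {
            /* ... find the matching vcpu and update svc->period/budget ... */
        }
        spin_unlock_irqrestore(&prv->lock, flags);
    }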


> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1093,7 +1093,9 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
>
>      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
>           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
> -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_putvcpuinfo) &&
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getvcpuinfo)) )

Imo this should become a switch now.

Do you mean "switch ( op->cmd )"? I'm afraid that would make it look more complicated.
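
Just to make sure we mean the same thing, that would be something like this
(sketch, with the existing sched_id check kept as it is):

    switch ( op->cmd )
    {
    case XEN_DOMCTL_SCHEDOP_putinfo:
    case XEN_DOMCTL_SCHEDOP_getinfo:
    case XEN_DOMCTL_SCHEDOP_putvcpuinfo:
    case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
        break;
    default:
        return -EINVAL;
    }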


> @@ -351,9 +361,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>  #define XEN_SCHEDULER_ARINC653 7
>  #define XEN_SCHEDULER_RTDS     8
>
> -/* Set or get info? */
> +/* Set or get info */

Pointless change (and if you really mean to do it, then please such
that the comment in the end matches our coding style).

> @@ -373,8 +386,11 @@ struct xen_domctl_scheduler_op {
>              uint16_t weight;
>          } credit2;
>          struct xen_domctl_sched_rtds {
> -            uint32_t period;
> -            uint32_t budget;
> +            uint64_t period;
> +            uint64_t budget;

This widening of the fields seems both unrelated and unnecessary.

Jan




--
Chong Li
Department of Computer Science and Engineering
Washington University in St. Louis