
Re: [Xen-devel] [PATCH v1 1/4] xen: enabling XL to set per-VCPU parameters of a domain for RTDS scheduler





On Fri, May 8, 2015 at 2:49 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>>> On 07.05.15 at 19:05, <lichong659@xxxxxxxxx> wrote:
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> @@ -1085,6 +1085,9 @@ rt_dom_cntl(
>      struct list_head *iter;
>      unsigned long flags;
>      int rc = 0;
> +    xen_domctl_sched_rtds_params_t *local_sched;
> +    int vcpu_index=0;
> +    int i;

unsigned int

> @@ -1110,6 +1113,67 @@ rt_dom_cntl(
>          }
>          spin_unlock_irqrestore(&prv->lock, flags);
>          break;
> +    case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
> +        op->u.rtds.nr_vcpus = 0;
> +        spin_lock_irqsave(&prv->lock, flags);
> +        list_for_each( iter, &sdom->vcpu )
> +            vcpu_index++;
> +        spin_unlock_irqrestore(&prv->lock, flags);
> +        op->u.rtds.nr_vcpus = vcpu_index;

Does dropping of the lock here and re-acquiring it below really work
race free?

Here, the lock is used in the same way as in the two cases above
(XEN_DOMCTL_SCHEDOP_get/putinfo). So I think that if those two cases are
race free, the lock in this case works race free as well.
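
That said, if the vcpu list could in principle change while the lock is dropped
for the allocation, a defensive version of the second loop (rough, untested
sketch only, reusing the variables from the patch above) could bound the copy
by the number of elements actually allocated:

    i = 0;
    spin_lock_irqsave(&prv->lock, flags);
    list_for_each( iter, &sdom->vcpu )
    {
        struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);

        /* Stop if the list grew after the count was taken above. */
        if ( i >= op->u.rtds.nr_vcpus )
            break;
        local_sched[i].budget = svc->budget / MICROSECS(1);
        local_sched[i].period = svc->period / MICROSECS(1);
        local_sched[i].index = i;
        i++;
    }
    spin_unlock_irqrestore(&prv->lock, flags);
    copy_to_guest(op->u.rtds.vcpus, local_sched, i);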


> +        local_sched = xzalloc_array(xen_domctl_sched_rtds_params_t,
> +                vcpu_index);
> +        if( local_sched == NULL )
> +        {
> +            return -ENOMEM;
> +        }

Pointless braces.

> +        vcpu_index = 0;
> +        spin_lock_irqsave(&prv->lock, flags);
> +        list_for_each( iter, &sdom->vcpu )
> +        {
> +            struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +
> +            local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
> +            local_sched[vcpu_index].period = svc->period / MICROSECS(1);
> +            local_sched[vcpu_index].index = vcpu_index;

What use is this index to the caller? I think you rather want to tell it
the vCPU number. That's especially also taking the use case of a
get/set pair into account - unless you tell me that these indexes can
never change, the indexes passed back into the set operation would
risk to have become stale by the time the hypervisor processes the
request.

I don't quite understand what "stale" means here. The array here (local_sched[])
and the array (in libxc) that local_sched[] is copied to are both used only for
this get operation. When users set per-vcpu parameters, dedicated arrays are
used for that set operation as well.
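
If reporting the actual vCPU number is what is wanted, I think the loop could
record the id held in the vcpu struct instead of the list position (sketch,
assuming struct rt_vcpu keeps its vcpu back-pointer):

    list_for_each( iter, &sdom->vcpu )
    {
        struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);

        local_sched[vcpu_index].budget = svc->budget / MICROSECS(1);
        local_sched[vcpu_index].period = svc->period / MICROSECS(1);
        /* Report the stable vCPU number rather than the list position. */
        local_sched[vcpu_index].index = svc->vcpu->vcpu_id;
        vcpu_index++;
    }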

> +            vcpu_index++;
> +        }
> +        spin_unlock_irqrestore(&prv->lock, flags);
> +        copy_to_guest(op->u.rtds.vcpus, local_sched, vcpu_index);
> +        xfree(local_sched);
> +        rc = 0;
> +        break;
> +    case XEN_DOMCTL_SCHEDOP_putvcpuinfo:
> +        local_sched = xzalloc_array(xen_domctl_sched_rtds_params_t,
> +                op->u.rtds.nr_vcpus);

While above using xzalloc_array() is warranted for security reasons,
I don't see why you wouldn't be able to use xmalloc_array() here.
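
If the point is that every element is fully overwritten by the copy_from_guest()
below, then I take it something like this (sketch) would be sufficient here:

    local_sched = xmalloc_array(xen_domctl_sched_rtds_params_t,
                                op->u.rtds.nr_vcpus);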

> +        if( local_sched == NULL )
> +        {
> +            return -ENOMEM;
> +        }
> +        copy_from_guest(local_sched, op->u.rtds.vcpus, op->u.rtds.nr_vcpus);
> +
> +        for( i = 0; i < op->u.rtds.nr_vcpus; i++ )
> +        {
> +            vcpu_index = 0;
> +            spin_lock_irqsave(&prv->lock, flags);
> +            list_for_each( iter, &sdom->vcpu )
> +            {
> +                struct rt_vcpu *svc = list_entry(iter, struct rt_vcpu, sdom_elem);
> +                if ( local_sched[i].index == vcpu_index )
> +                {
> +                    if ( local_sched[i].period <= 0 || local_sched[i].budget <= 0 )
> +                        return -EINVAL;
> +
> +                    svc->period = MICROSECS(local_sched[i].period);
> +                    svc->budget = MICROSECS(local_sched[i].budget);
> +                    break;
> +                }
> +                vcpu_index++;
> +            }
> +            spin_unlock_irqrestore(&prv->lock, flags);
> +        }

Considering a maximum size guest, these two nested loops could
require a couple of million iterations. That's too much without any
preemption checks in the middle.

The section protected by the lock is only the "list_for_each" loop, whose
running time is limited by the number of vcpus of a domain (32 at most).
If this does cause problems, I think adding a "hypercall_preempt_check()"
in the outer "for" loop may help. Is that right?
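
Roughly like this (sketch only; I still need to check how -ERESTART is handled
on this particular domctl path):

    for ( i = 0; i < op->u.rtds.nr_vcpus; i++ )
    {
        if ( hypercall_preempt_check() )
        {
            /* Give up the CPU and let the hypercall be restarted. */
            rc = -ERESTART;
            break;
        }
        spin_lock_irqsave(&prv->lock, flags);
        list_for_each( iter, &sdom->vcpu )
        {
            /* ... find the matching vcpu and update svc->period/budget ... */
        }
        spin_unlock_irqrestore(&prv->lock, flags);
    }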


> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -1093,7 +1093,9 @@ long sched_adjust(struct domain *d, struct xen_domctl_scheduler_op *op)
>
>      if ( (op->sched_id != DOM2OP(d)->sched_id) ||
>           ((op->cmd != XEN_DOMCTL_SCHEDOP_putinfo) &&
> -          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo)) )
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getinfo) &&
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_putvcpuinfo) &&
> +          (op->cmd != XEN_DOMCTL_SCHEDOP_getvcpuinfo)) )

Imo this should become a switch now.

Do you mean "switch ( op->cmd )"? I'm afraid that would make it look more complicated.
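
Just to make sure we mean the same thing, that would be something like this
(sketch, with the existing sched_id check kept as it is):

    switch ( op->cmd )
    {
    case XEN_DOMCTL_SCHEDOP_putinfo:
    case XEN_DOMCTL_SCHEDOP_getinfo:
    case XEN_DOMCTL_SCHEDOP_putvcpuinfo:
    case XEN_DOMCTL_SCHEDOP_getvcpuinfo:
        break;
    default:
        return -EINVAL;
    }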


> @@ -351,9 +361,12 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>  #define XEN_SCHEDULER_ARINC653 7
>  #define XEN_SCHEDULER_RTDS     8
>
> -/* Set or get info? */
> +/* Set or get info */

Pointless change (and if you really mean to do it, then please such
that the comment in the end matches our coding style).

> @@ -373,8 +386,11 @@ struct xen_domctl_scheduler_op {
>              uint16_t weight;
>          } credit2;
>          struct xen_domctl_sched_rtds {
> -            uint32_t period;
> -            uint32_t budget;
> +            uint64_t period;
> +            uint64_t budget;

This widening of the fields seems both unrelated and unnecessary.

Jan




--
Chong Li
Department of Computer Science and Engineering
Washington University in St. Louis