
Re: [Xen-devel] [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model





On 2/25/2016 6:31 PM, Dario Faggioli wrote:
Hey again,

Thanks for turning this around so quickly.

We are getting closer and closer, although (of course :-)) I have some
more comments.

However, is there a particular reason why you are keeping the RFC tag?
As long as you keep it, it's like saying that you are seeking feedback,
but you don't think the patch itself should be considered for
upstreaming yet. As far as my opinion goes, this patch is not ready to
go in right now (as I said, I've got questions and comments), but its
status is way past RFC.


Oh, OK. I was under the impression that RFC means "request for comments" and should always be used because, indeed, I am asking for comments.

On Thu, 2016-02-25 at 15:05 -0500, Tianyang Chen wrote:
changes since v5:
     removed unnecessary vcpu_on_replq() checks
     deadline_queue_insert() returns a flag to indicate if it's
     needed to re-program the timer
     removed unnecessary timer checks
     added helper functions to remove vcpus from queues
     coding style

Changes since v4:
     removed unnecessary replenishment queue checks in vcpu_wake()
     extended replq_remove() to all cases in vcpu_sleep()
     used _deadline_queue_insert() helper function for both queues
     _replq_insert() and _replq_remove() program timer internally

Changes since v3:
     removed running queue.
     added repl queue to keep track of repl events.
     timer is now per scheduler.
     timer is init on a valid cpu in a cpupool.

So, this does not belong here. It is fine to have it in this part of
the email, but it should not end up in the actual commit changelog once
the patch is committed into Xen's git tree.

The way to achieve that is to put this summary of changes below the
actual changelog, and below the Signed-off-by lines, after a marker
that looks like this: "---".
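
For instance (purely illustrative, not your actual patch), the body of
the email would then be laid out like this:

  xen: sched: convert RTDS from time to event driven model

  <the real changelog: what the patch does and why>

  Signed-off-by: Tianyang Chen <...>
  ---
  Changes since v5:
   - ...

  diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
  ...

Everything between the "---" and the diff is ignored by git when the
patch is applied, which is exactly what we want for per-version notes
like these.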

Budget replenishment and enforcement are separated by adding
a replenishment timer, which fires at the next most imminent
release time of all runnable vcpus.

A new runningq has been added to keep track of all vcpus that
are on pcpus.

Mmm.. Is this the proper changelog? runningq is something we discussed,
and that appeared in v2, but is certainly no longer here... :-O


oops...

Wait, maybe you misunderstood and/or I did not make myself clear enough
(in which case, sorry). I never meant to say "always stop the timer".
Actually, in one of my last emails I said the opposite, and I think that
would be the best thing to do.

Do you think there is any specific reason why we need to always stop
and restart it? If not, I think we can:
  - have deadline_queue_remove() also return whether the element
    removed was the first one (i.e., the one the timer was programmed
    after);
  - if it was, re-program the timer after the new front of the queue;
  - if it wasn't, nothing to do.

Thoughts?


It was originally my thought that the timer needs to be re-programmed only when the top vcpu is taken off the replenishment queue. So did you mean that, when I was manipulating repl_timer->expires by hand, the timer should have been stopped using the proper timer API instead? That manipulation is gone now. Also, set_timer() internally disables the timer first, so I assume it's safe to just call set_timer() directly, right?
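
Concretely, the remove path I have in mind looks something like the sketch below (helper and field names such as rt_replq(), replq_elem(), replq_elem and repl_timer are just how I'm spelling them here; they may not match the patch exactly):

/*
 * Sketch only: deadline_queue_remove() reports whether the element it
 * removed was at the front of the queue, i.e. the event the
 * replenishment timer is currently armed for.
 */
static inline int
deadline_queue_remove(struct list_head *queue, struct list_head *elem)
{
    int was_first = (queue->next == elem);

    list_del_init(elem);

    return was_first;
}

static void
__replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
{
    struct rt_private *prv = rt_priv(ops);
    struct list_head *replq = rt_replq(ops);

    ASSERT( __vcpu_on_replq(svc) );

    if ( deadline_queue_remove(replq, &svc->replq_elem) )
    {
        /*
         * The event we removed was the one the timer was programmed
         * after: re-arm it on the new front of the queue (set_timer()
         * stops the timer internally, so no explicit stop is needed),
         * or stop it if the queue is now empty.
         */
        if ( !list_empty(replq) )
        {
            struct rt_vcpu *next = replq_elem(replq->next);

            set_timer(prv->repl_timer, next->cur_deadline);
        }
        else
            stop_timer(prv->repl_timer);
    }
}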

@@ -405,22 +500,38 @@ __runq_insert(const struct scheduler *ops, struct rt_vcpu *svc)

      /* add svc to runq if svc still has budget */
      if ( svc->cur_budget > 0 )
-    {
-        list_for_each(iter, runq)
-        {
-            struct rt_vcpu * iter_svc = __q_elem(iter);
-            if ( svc->cur_deadline <= iter_svc->cur_deadline )
-                    break;
-         }
-        list_add_tail(&svc->q_elem, iter);
-    }
+        deadline_queue_insert(&__q_elem, svc, &svc->q_elem, runq);
      else
      {
          list_add(&svc->q_elem, &prv->depletedq);
+        ASSERT( __vcpu_on_replq(svc) );

So, by doing this, you're telling me that, if the vcpu is added to the
depleted queue, there must be a replenishment planned for it (or the
ASSERT() would fail).

But if we are adding it to the runq, there may or may not be a
replenishment planned for it.

Is this what we want? Why shouldn't there already be a replenishment
planned for a vcpu going onto the runq as well (to happen at its next
deadline)?


It looks like the current code doesn't add a vcpu to the replenishment queue when rt_vcpu_insert() is called. When the scheduler is initialized, all the vcpus are added to the replenishment queue only after they wake up from sleep. This needs to be changed (i.e., add the vcpu to the replq in rt_vcpu_insert()) to make it consistent, in the sense that whenever rt_vcpu_insert() is called, the vcpu has a corresponding replenishment event queued.

This way the ASSERT() applies to both cases in __runq_insert(), enforcing the invariant that "when a vcpu is inserted onto the runq/depletedq, a replenishment event is waiting for it".
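
In other words, rt_vcpu_insert() would gain a __replq_insert() call, roughly like this (a sketch on top of the current function, not tested yet):

static void
rt_vcpu_insert(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu *svc = rt_vcpu(vc);
    s_time_t now = NOW();
    spinlock_t *lock;

    BUG_ON( is_idle_vcpu(vc) );

    lock = vcpu_schedule_lock_irq(vc);

    if ( now >= svc->cur_deadline )
        rt_update_deadline(now, svc);

    if ( !__vcpu_on_q(svc) && vcpu_runnable(vc) && !vc->is_running )
    {
        /*
         * Plan the replenishment right away, so that the invariant
         * "a vcpu on the runq/depletedq always has a replenishment
         * event queued" holds from the moment the vcpu enters the
         * scheduler, and the ASSERT() in __runq_insert() is valid
         * for both queues.
         */
        __replq_insert(ops, svc);
        __runq_insert(ops, svc);
    }

    vcpu_schedule_unlock_irq(lock, vc);

    SCHED_STAT_CRANK(vcpu_insert);
}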

@@ -1087,10 +1193,6 @@ static void
  rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
  {
      struct rt_vcpu *svc = rt_vcpu(vc);
-    struct rt_vcpu *snext = NULL;
-    struct rt_dom *sdom = NULL;
-    struct rt_private *prv = rt_priv(ops);
-    cpumask_t *online;
      spinlock_t *lock = vcpu_schedule_lock_irq(vc);

      clear_bit(__RTDS_scheduled, &svc->flags);
@@ -1102,14 +1204,7 @@ rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
           likely(vcpu_runnable(vc)) )
      {
          __runq_insert(ops, svc);
-        __repl_update(ops, NOW());
-
-        ASSERT(!list_empty(&prv->sdom));
-        sdom = list_entry(prv->sdom.next, struct rt_dom, sdom_elem);
-        online = cpupool_domain_cpumask(sdom->dom);
-        snext = __runq_pick(ops, online); /* pick snext from ALL cpus */
-
-        runq_tickle(ops, snext);
+        runq_tickle(ops, svc);
      }

So, here we are.

What I meant was to make this function look more or less like this:

static void
rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
{
     struct rt_vcpu *svc = rt_vcpu(vc);
     spinlock_t *lock = vcpu_schedule_lock_irq(vc);

     clear_bit(__RTDS_scheduled, &svc->flags);
     /* not insert idle vcpu to runq */
     if ( is_idle_vcpu(vc) )
         goto out;

     if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
          likely(vcpu_runnable(vc)) )
     {
         __runq_insert(ops, svc);
         runq_tickle(ops, svc);
     }
     else
         __replq_remove(ops, svc);
out:
     vcpu_schedule_unlock_irq(lock, vc);
}

And, as said above, if you do this, try also removing the
__replq_remove() call from rt_vcpu_sleep(), this one should cover for
that (and, actually, more than just that!).
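
I.e., rt_vcpu_sleep() would go back to looking more or less like this (typing from memory, so double check against the actual code and the helper renames in your patch):

static void
rt_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu * const svc = rt_vcpu(vc);

    BUG_ON( is_idle_vcpu(vc) );
    SCHED_STAT_CRANK(vcpu_sleep);

    if ( curr_on_cpu(vc->processor) == vc )
        cpu_raise_softirq(vc->processor, SCHEDULE_SOFTIRQ);
    else if ( __vcpu_on_q(svc) )
        __q_remove(svc);
    else if ( svc->flags & RTDS_delayed_runq_add )
        clear_bit(__RTDS_delayed_runq_add, &svc->flags);

    /*
     * Note: no __replq_remove() here; the else branch in
     * rt_context_saved() covers that (and more).
     */
}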

After all this, check whether you still have the ASSERT() in
__replq_insert() triggering, and let me know.

So, after moving __replq_remove() to rt_context_saved(), the ASSERT() in __replq_insert() still fails when dom0 boots up:

(XEN) Assertion '!vcpu_on_replq(svc)' failed at sched_rt.c:524
(XEN) ----[ Xen-4.7-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82d080128c07>] sched_rt.c#__replq_insert+0x2b/0x64
(XEN) RFLAGS: 0000000000010002   CONTEXT: hypervisor (d0v3)
(XEN) rax: 0000000000000001   rbx: ffff83023b522940   rcx: 0000000000000001
(XEN) rdx: 00000031bb1b9980   rsi: ffff83023b486760   rdi: ffff83023b486760
(XEN) rbp: ffff8300bfcffd48   rsp: ffff8300bfcffd28   r8:  0000000000000004
(XEN) r9:  00000000deadbeef   r10: ffff82d08025f5a0   r11: 0000000000000206
(XEN) r12: ffff83023b486760   r13: ffff83023b522d80   r14: ffff83023b4b5000
(XEN) r15: 000000023a6e774b   cr0: 0000000080050033   cr4: 00000000000406a0
(XEN) cr3: 0000000231c0d000   cr2: ffff8802200d81f8
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen stack trace from rsp=ffff8300bfcffd28:
(XEN)    ffff82d08019642e ffff83023b486760 ffff8300bfd47000 ffff82d080299b00
(XEN)    ffff8300bfcffd88 ffff82d08012a072 ffff8300bfcffd70 ffff8300bfd47000
(XEN)    000000023a6e75c0 ffff83023b522940 ffff82d08032bc00 0000000000000282
(XEN)    ffff8300bfcffdd8 ffff82d08012be2c ffff83023b4b5000 ffff83023b4f1000
(XEN)    ffff8300bfd44000 ffff8300bfd47000 0000000000000001 ffff83023b4b40c0
(XEN)    ffff880230c14440 0000000000000000 ffff8300bfcffde8 ffff82d08012c347
(XEN)    ffff8300bfcffe08 ffff82d080169cea ffff83023b4b5000 0000000000000003
(XEN)    ffff8300bfcffe18 ffff82d080169d65 ffff8300bfcffe38 ffff82d08010762a
(XEN)    ffff83023b4b40c0 ffff83023b4b5000 ffff8300bfcffe68 ffff82d08010822a
(XEN)    ffff8300bfcffe68 fffffffffffffff2 ffff88022061dcb4 0000000000000000
(XEN)    ffff8300bfcffef8 ffff82d0801096fc 0000000000000001 ffff8300bfcfff18
(XEN)    ffff8300bfcffef8 ffff82d080240e85 ffff8300bfcf8000 0000000000000000
(XEN)    0000000000000246 ffffffff810013aa 0000000000000003 ffffffff810013aa
(XEN)    ffff8300bfcffee8 ffff8300bfd44000 ffff8802205e8000 0000000000000000
(XEN)    ffff880230c14440 0000000000000000 00007cff403000c7 ffff82d0802439e2
(XEN)    ffffffff8100140a 0000000000000020 0000000000000003 ffff880230c71900
(XEN)    ffff8802206584d0 ffff880220658000 ffff88022061dcb8 0000000000000000
(XEN)    0000000000000206 0000000000000000 ffff880223000168 ffff880223408e00
(XEN)    0000000000000020 ffffffff8100140a 0000000000000000 ffff88022061dcb4
(XEN)    0000000000000004 0001010000000000 ffffffff8100140a 000000000000e033
(XEN) Xen call trace:
(XEN)    [<ffff82d080128c07>] sched_rt.c#__replq_insert+0x2b/0x64
(XEN)    [<ffff82d08012a072>] sched_rt.c#rt_vcpu_wake+0xf2/0x12c
(XEN)    [<ffff82d08012be2c>] vcpu_wake+0x213/0x3d4
(XEN)    [<ffff82d08012c347>] vcpu_unblock+0x4b/0x4d
(XEN)    [<ffff82d080169cea>] vcpu_kick+0x20/0x6f
(XEN)    [<ffff82d080169d65>] vcpu_mark_events_pending+0x2c/0x2f
(XEN)    [<ffff82d08010762a>] event_2l.c#evtchn_2l_set_pending+0xa9/0xb9
(XEN)    [<ffff82d08010822a>] evtchn_send+0x158/0x183
(XEN)    [<ffff82d0801096fc>] do_event_channel_op+0xe21/0x147d
(XEN)    [<ffff82d0802439e2>] lstar_enter+0xe2/0x13c
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion '!vcpu_on_replq(svc)' failed at sched_rt.c:524
(XEN) ****************************************
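
If I read the trace right, the vcpu being woken (d0v3) still has its replenishment event queued from before it blocked, since nothing removes it on the sleep path any more, so the now-unconditional __replq_insert() in rt_vcpu_wake() trips its ASSERT(). Re-adding a guard there, e.g.:

    /* In rt_vcpu_wake(): only queue a replenishment if none is pending. */
    if ( !__vcpu_on_replq(svc) )
        __replq_insert(ops, svc);

would make the assertion go away, but I suspect that is just papering over the question of where the replenishment event should really be removed.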

Thanks,
Tianyang
