[Xen-devel] [PATCH] xen:rtds:fix bug in accounting budget



The bug was introduced in Xen 4.7 when we converted the RTDS scheduler
from a quantum-driven model to an event-driven model.
We assumed that rt_schedule() is always called for a VCPU
before the VCPU's budget replenishment handler runs.
This assumption does not hold when the system is overloaded or
when the VCPU's budget is almost equal to its period.

Buggy behavior:
1) A VCPU may get less budget than it was assigned in a period.
2) A full capacity VCPU, i.e., a VCPU whose period is equal to its budget,
   may not get any budget in some periods.

Bug analysis:
1) A VCPU's deadline can be fast-forwarded by more than one period.
   However, the VCPU's last_start time was not updated at the same time.
   If rt_schedule() is called after rt_update_deadline(), which happens
   when the VCPU's budget is equal to its period or when the VCPU has
   missed a deadline, burn_budget() will burn the budget that was just
   replenished, although the replenished budget should be used only in
   the most recent period.

   We should update the VCPU's last_start time to the start of the
   current period when rt_update_deadline() moves the VCPU to a new
   period.

2) When a full capacity VCPU depletes its budget and is being context
   switched out, but the core's currently running VCPU has not yet been
   updated, the budget replenishment timer may fire.
   The replenishment handler then failed to reschedule the full capacity
   VCPU because it believed the VCPU was still running.

   When a VCPU's budget is replenished, we try to tickle a CPU.
   If the VCPU is being context switched out and no other core can be
   found for it to tickle, we now always tickle the core where the VCPU
   was running.

This bug was reported by Dagaen Golomb.

Signed-off-by: Meng Xu

---
Cc: Dagaen Golomb <dgolomb@xxxxxxxxxxxxxx>
Cc: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
Cc: Linh Thi Xuan Phan <linhphan@xxxxxxxxxxxxx>
Cc: Haoran Li <lihaoran@xxxxxxxxx>
Cc: Meng Xu <xumengpanda@xxxxxxxxx>
---
 xen/common/sched_rt.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index d95f798..cdc5c06 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -407,6 +407,13 @@ rt_update_deadline(s_time_t now, struct rt_vcpu *svc)
         svc->cur_deadline += count * svc->period;
     }
 
+    /*
+     * rt_schedule() may run after the deadline has been updated;
+     * only deduct the budget consumed in the current period.
+     */
+    if ( svc->last_start < (svc->cur_deadline - svc->period) )
+        svc->last_start = svc->cur_deadline - svc->period;
+
     svc->cur_budget = svc->budget;
 
     /* TRACE */
@@ -1195,6 +1202,19 @@ runq_tickle(const struct scheduler *ops, struct rt_vcpu *new)
         goto out;
     }
 
+    /*
+     * new may have been preempted because it ran out of budget, and
+     * its budget may be replenished before it is context switched out;
+     * new may then preempt the to-be-scheduled VCPU on its previous CPU.
+     */
+    if ( curr_on_cpu(new->vcpu->processor) == new->vcpu &&
+         test_bit(__RTDS_delayed_runq_add, &new->flags) )
+    {
+        SCHED_STAT_CRANK(tickled_busy_cpu);
+        cpu_to_tickle = new->vcpu->processor;
+        goto out;
+    }
+
     /* didn't tickle any cpu */
     SCHED_STAT_CRANK(tickled_no_cpu);
     return;
@@ -1472,6 +1492,7 @@ static void repl_timer_handler(void *data){
     {
         svc = replq_elem(iter);
 
+        /* Another ready VCPU may preempt svc who updates its deadline */
         if ( curr_on_cpu(svc->vcpu->processor) == svc->vcpu &&
              !list_empty(runq) )
         {
@@ -1480,8 +1501,9 @@ static void repl_timer_handler(void *data){
             if ( svc->cur_deadline > next_on_runq->cur_deadline )
                 runq_tickle(ops, next_on_runq);
         }
-        else if ( vcpu_on_q(svc) &&
-                  __test_and_clear_bit(__RTDS_depleted, &svc->flags) )
+        /* svc may preempt another VCPU because it has budget again */
+        if ( __test_and_clear_bit(__RTDS_depleted, &svc->flags) &&
+             vcpu_runnable(svc->vcpu) )
             runq_tickle(ops, svc);
 
         list_del(&svc->replq_elem);
-- 
1.9.1