Re: [RFC] RTDS scheduler: potential issues found during safety analysis
(dropping Dario's duplicate but wrong email; adding Jürgen - see
./MAINTAINERS)

On 19.03.2026 18:49, Oleksii Moisieiev wrote:
> Hi all,
> We have been performing analysis of the RTDS scheduler code
> (xen/common/sched/rt.c) and identified several potential issues that we
> would like to bring to the community's attention. We would appreciate
> your feedback on whether these issues are considered worth addressing,
> and if so, what the preferred approach would be.

The question is a little odd: If you mean to use the scheduler in
production, a goal is going to be to move it out of experimental state.
For this, sorting issues like the ones you enumerate is pretty much a
requirement. And yes, the suggested approaches look plausible to me.

One other aspect: Please don't send HTML mail. I'm leaving everything as
reply context, for you to see how it ends up looking when read as plain
text (with my mail UI at least).

Jan

> Below is a summary of the findings. All references are to the current
> upstream code.
>
> 1. Inconsistent validation in domain-wide vs per-vCPU parameter update
> ----------------------------------------------------------------------
> In rt_dom_cntl(), the XEN_DOMCTL_SCHEDOP_putinfo path (domain-wide
> parameter update) only validates:
>
>     if ( op->u.rtds.period == 0 || op->u.rtds.budget == 0 )
>
> In contrast, the XEN_DOMCTL_SCHEDOP_putvcpuinfo path (per-vCPU update)
> enforces stricter checks:
>
>     if ( period > RTDS_MAX_PERIOD || budget < RTDS_MIN_BUDGET ||
>          budget > period || period < RTDS_MIN_PERIOD )
>
> This means the domain-wide path accepts configurations where budget
> exceeds period, or where period/budget fall below the 10 us minimum
> that the per-vCPU path enforces. Such parameters can lead to scheduling
> overhead issues (very short periods) or over-allocation (budget >
> period).
>
> Suggested fix: apply identical validation constraints on both paths, i.e.
> add the same bounds checks (budget <= period, period >= RTDS_MIN_PERIOD,
> budget >= RTDS_MIN_BUDGET, period <= RTDS_MAX_PERIOD) to the putinfo
> path. Additionally, the putinfo path does not handle the extratime
> flag at all, unlike the putvcpuinfo path.
>
> 2. Priority level overflow for extratime vCPUs
> ----------------------------------------------
> In burn_budget(), when an extratime vCPU exhausts its budget:
>
>     svc->priority_level++;
>     svc->cur_budget = svc->budget;
>
> The priority_level field is declared as `unsigned` (32-bit) and there
> is no upper bound check before the increment. While
> rt_update_deadline() resets priority_level to 0 at each period
> rollover, for a long-running extratime vCPU that continuously exhausts
> its budget within a single period, the counter could theoretically
> wrap from UINT_MAX to 0. Since priority_level 0 represents the highest
> scheduling priority, a wraparound would cause the extratime vCPU to
> suddenly preempt vCPUs with active real-time reservations. While this
> scenario requires an extreme number of budget exhaustion cycles within
> a single period, it is a concern for long-running embedded or safety
> systems that operate without reboot for extended durations.
>
> Suggested fix: saturate priority_level at a safe maximum value (e.g.,
> UINT_MAX - 1) instead of allowing unbounded increment.
>
> 3. Replenishment timer loss during CPU pool reconfiguration
> -----------------------------------------------------------
> When the last pCPU is removed from an RTDS CPU pool, move_repl_timer()
> kills the replenishment timer via kill_timer(). When a pCPU is later
> re-added, rt_switch_sched() re-initializes the timer object (if status
> is TIMER_STATUS_killed) but does not re-arm it from the existing
> replenishment queue. If the replq already contains pending entries,
> those replenishments will not fire until some other event explicitly
> calls set_timer(), potentially stalling all non-extratime vCPUs.
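The saturation suggested for issue 2 above could be as simple as the
sketch below. Note this is a minimal illustration with a mock struct and
a made-up helper name, not actual Xen code:

```c
/* Hypothetical sketch of the saturating increment suggested for
 * issue 2. The struct is a minimal stand-in for Xen's struct rt_unit
 * and the function name is invented for illustration. */
#include <limits.h>

struct rt_unit {
    unsigned int priority_level;
    long long cur_budget;
    long long budget;
};

/* On budget exhaustion of an extratime unit: saturate instead of
 * wrapping, so priority_level can never overflow back to 0 (the
 * highest priority) and preempt real-time reservations. */
static void burn_budget_exhausted(struct rt_unit *svc)
{
    if ( svc->priority_level < UINT_MAX - 1 )
        svc->priority_level++;
    svc->cur_budget = svc->budget;
}
```

Since rt_update_deadline() resets the field at each period rollover
anyway, a saturated value only persists within the pathological period
that produced it.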
>
> We believe this is actually a broader issue that goes beyond the RTDS
> scheduler: the common cpupool infrastructure probably should not allow
> a cpupool that has assigned vCPUs to lose all of its pCPUs. Preventing
> such a state at the cpupool management level would address the root
> cause for all schedulers, not just RTDS.
>
> Suggested fix (RTDS-specific): when timer ownership is re-established
> in rt_switch_sched(), re-arm the replenishment timer to the earliest
> deadline in the replq if the queue is non-empty.
>
> Suggested fix (common): the cpupool code should refuse to remove the
> last pCPU from a cpupool that still has domains/vCPUs assigned to it,
> returning an error instead. This would prevent the problematic state
> from arising in the first place.
>
> 4. Missing scheduling notification on vCPU insertion
> ----------------------------------------------------
> rt_unit_insert() inserts runnable units into the replenishment and run
> queues but does not call runq_tickle(). In contrast, rt_unit_wake()
> and rt_context_saved() both call runq_tickle() after runq_insert().
> This means a newly inserted vCPU with a higher priority (earlier
> deadline) than currently running vCPUs will not be considered for
> execution until the next natural scheduling event (timer, sleep,
> budget expiry), which can delay scheduling by up to one full period.
>
> Suggested fix: add a runq_tickle() call after the runq_insert() in
> rt_unit_insert(), following the same pattern used in rt_unit_wake().
>
> 5. Stale scheduling flags on vCPU removal during context switch
> ---------------------------------------------------------------
> rt_unit_remove() removes queue membership via q_remove()/replq_remove()
> but does not clear the RTDS_delayed_runq_add or RTDS_scheduled flags.
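The change suggested for issue 4 above is essentially one extra call
after insertion. The sketch below uses trivial mocks for the queue and
the tickle to show the intended pattern; none of these stand-ins are
Xen's actual implementations:

```c
/* Hypothetical sketch of issue 4's fix: tickle right after inserting a
 * runnable unit, mirroring the pattern of rt_unit_wake(). The queue
 * and tickle below are minimal mocks invented for illustration. */
#include <stdbool.h>

static bool runq[8];      /* mock run queue: one slot per unit */
static int  tickled = -1; /* records which unit was last tickled */

static void runq_insert(int unit) { runq[unit] = true; }
static void runq_tickle(int unit) { tickled = unit; }

static void rt_unit_insert_fixed(int unit, bool runnable)
{
    if ( !runnable )
        return;
    runq_insert(unit);
    runq_tickle(unit); /* the call issue 4 says is missing today */
}
```

With the tickle in place, a newly inserted unit with an earlier deadline
can preempt immediately instead of waiting for the next scheduling
event.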
> If a vCPU is removed while it is being context-switched off a pCPU
> (i.e., RTDS_scheduled is set and RTDS_delayed_runq_add may be set),
> rt_context_saved() will later clear RTDS_scheduled and, finding
> RTDS_delayed_runq_add set, will re-insert the removed vCPU into the
> run queue via runq_insert() + runq_tickle(). This results in a stale
> vCPU reference on the scheduler's run queue, belonging to a domain
> that may be in the process of destruction or migration.
>
> Suggested fix: in rt_unit_remove(), explicitly clear
> RTDS_delayed_runq_add and RTDS_scheduled flags after removing queue
> membership, so that rt_context_saved() cannot re-insert a removed
> vCPU.
>
> We would appreciate any feedback on these findings. We are happy to
> prepare patches for any of the issues the community considers worth
> fixing.
>
> Best regards,
> Oleksii Moisieiev
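For reference, the flag clearing suggested for issue 5 amounts to two
bit clears after queue membership is removed. The bit positions and the
helper below are simplified stand-ins for illustration, not Xen's actual
definitions:

```c
/* Hypothetical sketch of issue 5's fix. Bit positions are made up;
 * Xen defines its own __RTDS_* bit numbers and uses clear_bit(). */
#define MOCK_RTDS_scheduled        1
#define MOCK_RTDS_delayed_runq_add 2

/* Clear both flags so a concurrent rt_context_saved() cannot observe
 * RTDS_delayed_runq_add and re-insert the already-removed unit. */
static void clear_stale_flags(unsigned long *flags)
{
    *flags &= ~(1UL << MOCK_RTDS_scheduled);
    *flags &= ~(1UL << MOCK_RTDS_delayed_runq_add);
}
```

Unrelated flag bits are left untouched, matching how per-bit clearing
works in the real code.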