[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] Introduce rt real-time scheduler for Xen



Hi all,

This serie of patches adds rt real-time scheduler to Xen.

In summary, It supports:
1) Preemptive Global Earliest Deadline First scheduling policy by using a 
global RunQ for the scheduler;
2) Assign/display each VCPU's parameters of each domain;
3) Supports CPU Pool

Based on the review comments/suggestion on version 1, the version 2 has the 
following improvements:
    a) Changed the interface of get/set vcpu's parameter from staticlly 
allocating a large array to dynamically allocating memory based on the number 
of vcpus of a domain. (This is a major change from v1 to v2 because many 
comments on v1 is about the code related to this functionality.)
    b) Changed the time unit at user interferce from 1ms to 1us; Changed the 
type of period and buget of a VCPU from uint16 to s_time_s.
    c) Changed the code style, rearrange the patch order, add comments to 
better explain the code.
    d) Domain 0 is no longer treated as a special domain. Domain 0's VCPUs are 
changed to be the same with domUs' VCPUs: domain 0's VCPUs have the same 
default value with domUs' VCPUs and are scheduled as domUs' VCPUs.
    e) Add more ASSERT(), e.g., in  __runq_insert() in sched_rt.c

-----------------------------------------------------------------------------------------------------------------------------
TODO:
    a) Add TRACE() in sched_rt.c functions.[easy]
       We will add a few xentrace tracepoints, like TRC_CSCHED2_RUNQ_POS in 
credit2 scheduler, in rt scheduler, to debug via tracing.
    b) Split runnable and depleted (=no budget left) VCPU queues.[easy]
    c) Deal with budget overrun in the algorithm [medium]
    d) Try using timers for replenishment, instead of scanning the full 
runqueue every now and then [medium]
    e) Reconsider the rt_vcpu_insert() and rt_vcpu_remove() for cpu pool 
support. 
    f) Method of improving the performance of rt scheduler [future work]
       VCPUs of the same domain may preempt each other based on the preemptive 
global EDF scheduling policy. This self-switch issue does not bring benefit to 
the domain but introduce more overhead. When this situation happens, we can 
simply promote the current running lower-priority VCPUâs priority and let it  
borrow budget from higher priority VCPUs to avoid such self-swtich issue.

Plan: 
    TODO a) and b) are expected in RFC v3; (2 weeks)
    TODO c), d) and e) are expected in RFC v4, v5; (3-4 weeks)
    TODO f) will be delayed after this scheduler is upstreamed because the 
improvement will make the scheduler not a pure global EDF scheduler.

-----------------------------------------------------------------------------------------------------------------------------
The design of this rt scheduler is as follows:
This rt scheduler follows the Preemptive Global Earliest Deadline First (GEDF) 
theory in real-time field.
Each VCPU can have a dedicated period and budget. While scheduled, a VCPU burns 
its budget. Each VCPU has its budget replenished at the beginning of each of 
its periods; Each VCPU discards its unused budget at the end of each of its 
periods. If a VCPU runs out of budget in a period, it has to wait until next 
period.
The mechanism of how to burn a VCPU's budget depends on the server mechanism 
implemented for each VCPU.
The mechanism of deciding the priority of VCPUs at each scheduling point is 
based on the Preemptive Global Earliest Deadline First scheduling scheme.

Server mechanism: a VCPU is implemented as a deferrable server.
When a VCPU has a task running on it, its budget is continuously burned;
When a VCPU has no task but with budget left, its budget is preserved.

Priority scheme: Global Earliest Deadline First (EDF).
At any scheduling point, the VCPU with earliest deadline has highest priority.

Queue scheme: A global runqueue for each CPU pool.
The runqueue holds all runnable VCPUs.
VCPUs in the runqueue are divided into two parts: with and without remaining 
budget.
At each part, VCPUs are sorted based on GEDF priority scheme.

Scheduling quanta: 1 ms.

If you are intersted in the details of the design and evaluation of this rt 
scheduler, please refer to our paper "Real-Time Multi-Core Virtual Machine 
Scheduling in Xen" (http://www.cis.upenn.edu/~mengxu/emsoft14/emsoft14.pdf) in 
EMSOFT14. This paper has the following details:
    a) Desgin of this scheduler;
    b) Measurement of the implementation overhead, e.g., scheduler overhead, 
context switch overhead, etc. 
    c) Comparison of this rt scheduler and credit scheduler in terms of the 
real-time performance.
-----------------------------------------------------------------------------------------------------------------------------
One scenario to show the functionality of this rt scheduler is as follows:
//list each vcpu's parameters of each domain in cpu pools using rt scheduler
#xl sched-rt
Cpupool Pool-0: sched=EDF
Name                                ID VCPU Period Budget
Domain-0                             0    0  10000  10000
Domain-0                             0    1  20000  20000
Domain-0                             0    2  30000  30000
Domain-0                             0    3  10000  10000
litmus1                              1    0  10000   4000
litmus1                              1    1  10000   4000

//set the parameters of the vcpu 1 of domain litmus1:
# xl sched-rt -d litmus1 -v 1 -p 20000 -b 10000

//domain litmus1's vcpu 1's parameters are changed, display each VCPU's 
parameters separately:
# xl sched-rt -d litmus1
Name                                ID VCPU Period Budget
litmus1                              1    0  10000   4000
litmus1                              1    1  20000  10000

// list cpupool information
xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-0              12        rt       y          2

//create a cpupool test
#xl cpupool-cpu-remove Pool-0 11
#xl cpupool-cpu-remove Pool-0 10
#xl cpupool-create name=\"test\" sched=\"credit\"
#xl cpupool-cpu-add test 11
#xl cpupool-cpu-add test 10
#xl cpupool-list
Name               CPUs   Sched     Active   Domain count
Pool-0              10        rt       y          2
test                 2    credit       y          0

//migrate litmus1 from cpupool Pool-0 to cpupool test.
#xl cpupool-migrate litmus1 test

//now litmus1 is in cpupool test
# xl sched-credit
Cpupool test: tslice=30ms ratelimit=1000us
Name                                ID Weight  Cap
litmus1                              1    256    0

-----------------------------------------------------------------------------------------------------------------------------
[PATCH RFC v2 1/4] xen: add real time scheduler rt
[PATCH RFC v2 2/4] libxc: add rt scheduler
[PATCH RFC v2 3/4] libxl: add rt scheduler
[PATCH RFC v2 4/4] xl: introduce rt scheduler
-----------------------------------------------------------------------------------------------------------------------------
Thanks for Dario, Wei, Ian, Andrew, George, and Konrad for your valuable 
comments and suggestions!

Any comment, question, and concerns are more than welcome! :-)

Thank you very much!

Best,

Meng

---
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.