Hi Mike,
>
> My first observation is that the credit scheduler will select a vcpu
> that has exceeded its credit when there is no other work to be done on
> any of the other physical cpus in the system.
In the version of the paper that you read and refer to, we deliberately
compared the three schedulers on a 1-CPU machine:
the goal was to compare the "BASIC" scheduler functionality.
I will present some additional results for the 2-CPU case at the Xen Summit.
>
> In light of the paper, with very low allocation targets for vcpus, it
> is not surprising that the positive allocation errors can be quite
> large. It is also not surprising that the errors (and error
> distribution) decrease with larger allocation targets.
Because the machine has only 1 CPU, the explanation of this phenomenon is
different (it is not related to load balancing of VCPUs), and the Credit
scheduler can/should be made more precise.
What our paper does not show is the original error distribution for Credit
(original -- meaning as first released). The results that you see in
the paper are with the next, significantly improved version by Emmanuel.
I believe that there is still significant room for improvement.
>
> None of this explains the negative allocation errors, where the vcpus
> received less than their pcpu allotments. I speculate that a couple of
> circumstances may contribute to negative allocation errors:
>
> very low weights attached to domains will cause the credit scheduler
> to attempt to pause vcpus almost every accounting cycle. vcpus may
> therefore not have as many opportunities to run as frequently as
> possible. If the ALERT measurement method is different, or has a
> different interval, than the credit schedulers 10ms tick and 30ms
> accounting cycle, negative errors may result in the view of ALERT.
The ALERT benchmark sets the allocation of a SINGLE domain (on a 1-CPU machine,
with no other competing domains while running this benchmark) to a chosen
target CPU allocation, e.g., 20%, in the non-work-conserving mode.
This means that the domain's CPU allocation is CAPPED at 20%. This single domain
runs "slurp" (a tight CPU loop, 1 process) to consume the allocated CPU share.
The monitoring part of ALERT simply collects measurements from the system
using both XenMon and xentop with 1-second reporting granularity.
Since 1 sec is so much larger than the 30 ms slices, it should be possible
to get a very accurate CPU allocation for larger CPU allocation targets.
However, for a 1% CPU allocation target you get an immediate error, because
Credit will allocate a 30 ms slice (that is 3% of 1 sec). If Credit
used 10 ms slices, then the error would (theoretically) be bounded
by 1%.
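To make that slice-quantization point concrete, here is a tiny illustrative
calculation (not part of ALERT itself): the worst-case error in any single
1-sec sample is roughly one scheduling slice expressed as a fraction of the
measurement window.

    #include <stdio.h>

    /* Illustrative only: worst-case per-sample error caused by slice
     * granularity, as a percentage of the 1-sec measurement window. */
    int main(void)
    {
        const double window_ms = 1000.0;
        const double slice_ms[] = { 30.0, 10.0 };  /* credit's slice vs. a finer one */

        for (int i = 0; i < 2; i++)
            printf("slice %2.0f ms -> up to ~%.0f%% error per 1-sec sample\n",
                   slice_ms[i], 100.0 * slice_ms[i] / window_ms);
        return 0;
    }

That is exactly the 3% vs. 1% difference mentioned above.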
The expectation is that each 1-sec measurement should show 20% CPU
utilization for this domain.
We run ALERT for different CPU allocation targets, from 1% to 90%.
The reported error is the difference between the targeted CPU allocation and
the measured CPU allocation at 1-sec granularity.
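In other words, the per-sample metric is computed essentially as in the
sketch below (hypothetical post-processing, not the actual ALERT scripts;
the sample values only stand in for the 1-sec utilization measurements
reported by XenMon/xentop):

    #include <stdio.h>

    /* Sketch of the reported metric: for each 1-sec sample, the difference
     * between the measured CPU utilization and the target allocation (in %). */
    static void report_errors(const double *measured, int n, double target_pct)
    {
        for (int i = 0; i < n; i++) {
            double err = measured[i] - target_pct;  /* >0 over-, <0 under-allocation */
            printf("sample %d: measured %.1f%%, target %.1f%%, error %+.1f%%\n",
                   i, measured[i], target_pct, err);
        }
    }

    int main(void)
    {
        double samples[] = { 21.3, 19.8, 22.0, 18.5 };  /* made-up values */
        report_errors(samples, 4, 20.0);
        return 0;
    }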
>
> I/O activity: if ALERT performs I/O activity, the test, even though
> it is "cpu intensive" may cause domu to block on dom0 frequently,
> meaning it will idle more, especially if dom0 has a low credit
> allocation.
There is no I/O activity: the ALERT workload is deliberately minimal, as
described above, and nothing else is happening in the system.
>
> Questions: how does ALERT measure actual cpu allocation? Using Xenmon?
As I've mentioned above, we have measurements from both XenMon and xentop;
they are very close for these experiments.
> How does the ALERT exercise the domain?
ALERT runs "slurp", a cpu-hungry loop, which will "eat"
as much CPU as you allocate to it. It is a single process application.
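(The slurp source is trivial; a minimal sketch of what such a CPU burner
amounts to is just the following, not necessarily the exact code we use:)

    /* Minimal sketch of a "slurp"-style CPU burner: a single process that
     * spins forever and consumes whatever CPU share it is given. */
    int main(void)
    {
        volatile unsigned long x = 0;
        for (;;)
            x++;   /* busy loop: never blocks, never does I/O */
        return 0;
    }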
> The paper didn't mention the
> actual system calls and hypercalls the domains are making when running
> ALERT.
There are none: it is a pure user-space benchmark.
Best regards, Lucy
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel