
Re: [Xen-devel] [PATCH] remove blocked time accounting from xen "clockchip"



On 11/09/11 14:35, Jan Beulich wrote:
> On 18.10.11 at 22:42, Laszlo Ersek <lersek@xxxxxxxxxx> wrote:
>> ... because the "clock_event_device framework" already accounts for idle
>> time through the "event_handler" function pointer in
>> xen_timer_interrupt().
>>
>> The patch is intended as the completion of [1]. It should fix the double
>> idle times seen in PV guests' /proc/stat [2]. It should be orthogonal to
>> stolen time accounting (the removed code seems to be isolated).
>
> After some more looking around I still think it's incorrect, albeit for
> a different reason: What tick_nohz_restart_sched_tick() accounts
> as idle time is *all* time that passed while in cpu_idle(). What gets
> accounted in do_stolen_accounting() (without your patch) is
> different:
> - time the vCPU was in RUNSTATE_blocked gets accounted as idle
> - time the vCPU was in RUNSTATE_runnable and RUNSTATE_offline
>   gets accounted as stolen.
>
> That is, on an overcommitted system (and without your patch) I
> would expect you to not see the (full) double idle increment for a not
> fully idle and not fully loaded vCPU.
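
Just to restate that split in code, here is a rough, self-contained
userspace sketch; the struct mirrors Xen's vcpu_runstate_info, but the
function and the remaining names are only illustrative, not the actual
do_stolen_accounting() code:

/* Rough sketch of the runstate split described above: per-VCPU runstate
 * deltas are taken since the last snapshot; RUNSTATE_blocked time goes
 * into the idle counter, RUNSTATE_runnable + RUNSTATE_offline time into
 * the steal counter. */

#include <stdint.h>
#include <stdio.h>

enum { RUNSTATE_running, RUNSTATE_runnable, RUNSTATE_blocked,
       RUNSTATE_offline, RUNSTATE_max };

struct runstate_snapshot {
    uint64_t time[RUNSTATE_max];    /* cumulative ns per state, from Xen */
};

struct accounting {
    uint64_t idle_ns;               /* ends up in the "idle" column */
    uint64_t steal_ns;              /* ends up in the "steal" column */
};

/* Run periodically on the VCPU, e.g. from the timer interrupt. */
static void stolen_accounting_step(const struct runstate_snapshot *now,
                                   struct runstate_snapshot *last,
                                   struct accounting *acct)
{
    uint64_t blocked  = now->time[RUNSTATE_blocked]  - last->time[RUNSTATE_blocked];
    uint64_t runnable = now->time[RUNSTATE_runnable] - last->time[RUNSTATE_runnable];
    uint64_t offline  = now->time[RUNSTATE_offline]  - last->time[RUNSTATE_offline];

    *last = *now;

    acct->idle_ns  += blocked;              /* blocked           -> idle   */
    acct->steal_ns += runnable + offline;   /* runnable/offline  -> stolen */
}

int main(void)
{
    /* Fake cumulative runstate times (ns), as if reported by Xen. */
    struct runstate_snapshot last = { { 0, 0, 0, 0 } };
    struct runstate_snapshot now  = { { 500000000ULL, 100000000ULL,
                                        300000000ULL, 0 } };
    struct accounting acct = { 0, 0 };

    stolen_accounting_step(&now, &last, &acct);
    printf("idle: %llu ns, stolen: %llu ns\n",
           (unsigned long long)acct.idle_ns,
           (unsigned long long)acct.steal_ns);
    return 0;
}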

I tried to verify this with an experiment; please check whether the experiment itself is bogus.

On a four-PCPU host (hyperthreading off, RHEL-5.7+ hypervisor & dom0) I started three virtual machines:

VM1: four VCPUs, four processes running a busy loop each, independently.
VM2: ditto
VM3: single VCPU running the attached program (which, on an uncontended CPU, virtual or physical, puts a 1/2 load on it; a rough sketch of such a program follows below). The guest OS is RHEL-6.1.
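
(Purely as an illustration, and not the actual attached 50.c: a half-load
program of that shape could spin for about 100 ms and then sleep for about
100 ms, in an endless loop, roughly like this.)

/* Illustrative only, not the actual attached 50.c: keep one CPU about
 * 50% busy by spinning for ~100 ms and then sleeping for ~100 ms,
 * forever. Link with -lrt on older glibc for clock_gettime(). */

#include <time.h>

static double now_sec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    const double busy = 0.1;                /* 100 ms of spinning per period */
    struct timespec idle_ts;

    idle_ts.tv_sec = 0;
    idle_ts.tv_nsec = 100L * 1000 * 1000;   /* 100 ms of sleeping */

    for (;;) {
        double start = now_sec();

        while (now_sec() - start < busy)
            ;                               /* burn CPU */
        nanosleep(&idle_ts, NULL);          /* give the CPU back */
    }
}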

In VM3, I also ran this script:

$ grep cpu0 /proc/stat; sleep 20; grep cpu0 /proc/stat
cpu0 10421 0 510 119943 608 0 1 122 0
cpu0 11420 0 510 121942 608 0 1 126 0

The difference in the fourth numerical column (the idle counter, in USER_HZ ticks, so about 20 seconds at 100 Hz) is still 1999, even though only 10 of those 20 seconds were spent idly.
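
(For completeness, the same measurement can be done with a small C program
as well; the one below is a hypothetical helper, not something used in the
original test. It samples the idle field of the cpu0 line twice, 20 seconds
apart, and prints the delta in USER_HZ ticks; at 100 Hz a fully idle VCPU
would show a delta of about 2000.)

/* Hypothetical helper, not part of the original test: read the idle field
 * (4th numeric column) of the cpu0 line in /proc/stat, wait 20 seconds,
 * read it again, and print the delta in USER_HZ ticks. */

#include <stdio.h>
#include <string.h>
#include <unistd.h>

static long long cpu0_idle_ticks(void)
{
    char label[16];
    long long idle = -1;
    FILE *f = fopen("/proc/stat", "r");

    if (!f)
        return -1;

    /* Per-CPU lines look like: "cpu0 10421 0 510 119943 608 0 1 122 0";
     * the 4th numeric field is the idle counter. */
    while (fscanf(f, "%15s %*lld %*lld %*lld %lld%*[^\n]", label, &idle) == 2) {
        if (strcmp(label, "cpu0") == 0)
            break;
        idle = -1;
    }

    fclose(f);
    return idle;
}

int main(void)
{
    long long before, after;

    before = cpu0_idle_ticks();
    sleep(20);
    after = cpu0_idle_ticks();

    if (before < 0 || after < 0)
        return 1;

    printf("cpu0 idle delta over 20 seconds: %lld ticks\n", after - before);
    return 0;
}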

Does the experiment miss the point (or do I), or does this disprove the idea?

(Interestingly, according to virt-manager, the load distribution between the VMs looked like:

VM1: 7/16 = 43.75%
VM2: 7/16 = 43.75%
VM3: 2/16 = 1/8 = 12.50%

as if VM3's load had been carved out first and the rest split evenly between VM1 and VM2. When I stop VM1 and VM2, VM3 stays at 12.5%. Under the above load, I would have expected:

VM1: 8/17 ~= 47.06%
VM2: 8/17 ~= 47.06%
VM3: 1/17 ~= 5.88%

ie. "eight and half" VCPUs sharing the host evenly. Could this have any relevance?)

Thank you
Laszlo

Attachment: 50.c
Description: Text document

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

