
Re: [Xen-devel] [PATCH] remove blocked time accounting from xen "clockchip"



On 11/09/11 14:35, Jan Beulich wrote:
> On 18.10.11 at 22:42, Laszlo Ersek <lersek@xxxxxxxxxx> wrote:
>> ... because the "clock_event_device framework" already accounts for idle
>> time through the "event_handler" function pointer in
>> xen_timer_interrupt().
>>
>> The patch is intended as the completion of [1]. It should fix the double
>> idle times seen in PV guests' /proc/stat [2]. It should be orthogonal to
>> stolen time accounting (the removed code seems to be isolated).
>
> After some more looking around I still think it's incorrect, albeit for
> a different reason: What tick_nohz_restart_sched_tick() accounts
> as idle time is *all* time that passed while in cpu_idle(). What gets
> accounted in do_stolen_accounting() (without your patch) is
> different:
> - time the vCPU was in RUNSTATE_blocked gets accounted as idle
> - time the vCPU was in RUNSTATE_runnable and RUNSTATE_offline
>   gets accounted as stolen.
>
> That is, on an overcommitted system (and without your patch) I
> would expect you to not see the (full) double idle increment for a not
> fully idle and not fully loaded vCPU.
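
Just to restate that split in code, here is a rough, self-contained
userspace sketch; the struct mirrors Xen's vcpu_runstate_info, but the
function and the remaining names are only illustrative, not the actual
do_stolen_accounting() code:

/* Rough sketch of the runstate split described above: per-VCPU runstate
 * deltas are taken since the last snapshot; RUNSTATE_blocked time goes
 * into the idle counter, RUNSTATE_runnable + RUNSTATE_offline time into
 * the steal counter. */

#include <stdint.h>
#include <stdio.h>

enum { RUNSTATE_running, RUNSTATE_runnable, RUNSTATE_blocked,
       RUNSTATE_offline, RUNSTATE_max };

struct runstate_snapshot {
    uint64_t time[RUNSTATE_max];    /* cumulative ns per state, from Xen */
};

struct accounting {
    uint64_t idle_ns;               /* ends up in the "idle" column */
    uint64_t steal_ns;              /* ends up in the "steal" column */
};

/* Run periodically on the VCPU, e.g. from the timer interrupt. */
static void stolen_accounting_step(const struct runstate_snapshot *now,
                                   struct runstate_snapshot *last,
                                   struct accounting *acct)
{
    uint64_t blocked  = now->time[RUNSTATE_blocked]  - last->time[RUNSTATE_blocked];
    uint64_t runnable = now->time[RUNSTATE_runnable] - last->time[RUNSTATE_runnable];
    uint64_t offline  = now->time[RUNSTATE_offline]  - last->time[RUNSTATE_offline];

    *last = *now;

    acct->idle_ns  += blocked;              /* blocked           -> idle   */
    acct->steal_ns += runnable + offline;   /* runnable/offline  -> stolen */
}

int main(void)
{
    /* Fake cumulative runstate times (ns), as if reported by Xen. */
    struct runstate_snapshot last = { { 0, 0, 0, 0 } };
    struct runstate_snapshot now  = { { 500000000ULL, 100000000ULL,
                                        300000000ULL, 0 } };
    struct accounting acct = { 0, 0 };

    stolen_accounting_step(&now, &last, &acct);
    printf("idle: %llu ns, stolen: %llu ns\n",
           (unsigned long long)acct.idle_ns,
           (unsigned long long)acct.steal_ns);
    return 0;
}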

I tried to verify this with an experiment; please check whether the experiment itself is bogus.

On a four-PCPU host (hyperthreading off, RHEL-5.7+ hypervisor & dom0) I started three virtual machines:

VM1: four VCPUs, four processes running a busy loop each, independently.
VM2: ditto
VM3: single VCPU running the attached program (which, on an uncontended CPU, virtual or physical, puts a 1/2 load on it; a rough sketch of such a program follows below). The guest OS is RHEL-6.1.
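
(Purely as an illustration, and not the actual attached 50.c: a half-load
program of that shape could spin for about 100 ms and then sleep for about
100 ms, in an endless loop, roughly like this.)

/* Illustrative only, not the actual attached 50.c: keep one CPU about
 * 50% busy by spinning for ~100 ms and then sleeping for ~100 ms,
 * forever. Link with -lrt on older glibc for clock_gettime(). */

#include <time.h>

static double now_sec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec / 1e9;
}

int main(void)
{
    const double busy = 0.1;                /* 100 ms of spinning per period */
    struct timespec idle_ts;

    idle_ts.tv_sec = 0;
    idle_ts.tv_nsec = 100L * 1000 * 1000;   /* 100 ms of sleeping */

    for (;;) {
        double start = now_sec();

        while (now_sec() - start < busy)
            ;                               /* burn CPU */
        nanosleep(&idle_ts, NULL);          /* give the CPU back */
    }
}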

In VM3, I also ran this script:

$ grep cpu0 /proc/stat; sleep 20; grep cpu0 /proc/stat
cpu0 10421 0 510 119943 608 0 1 122 0
cpu0 11420 0 510 121942 608 0 1 126 0

The difference in the fourth numerical column (the idle counter, in USER_HZ ticks, so about 20 seconds at 100 Hz) is still 1999, even though only 10 of those 20 seconds were spent idly.
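
(For completeness, the same measurement can be done with a small C program
as well; the one below is a hypothetical helper, not something used in the
original test. It samples the idle field of the cpu0 line twice, 20 seconds
apart, and prints the delta in USER_HZ ticks; at 100 Hz a fully idle VCPU
would show a delta of about 2000.)

/* Hypothetical helper, not part of the original test: read the idle field
 * (4th numeric column) of the cpu0 line in /proc/stat, wait 20 seconds,
 * read it again, and print the delta in USER_HZ ticks. */

#include <stdio.h>
#include <string.h>
#include <unistd.h>

static long long cpu0_idle_ticks(void)
{
    char label[16];
    long long idle = -1;
    FILE *f = fopen("/proc/stat", "r");

    if (!f)
        return -1;

    /* Per-CPU lines look like: "cpu0 10421 0 510 119943 608 0 1 122 0";
     * the 4th numeric field is the idle counter. */
    while (fscanf(f, "%15s %*lld %*lld %*lld %lld%*[^\n]", label, &idle) == 2) {
        if (strcmp(label, "cpu0") == 0)
            break;
        idle = -1;
    }

    fclose(f);
    return idle;
}

int main(void)
{
    long long before, after;

    before = cpu0_idle_ticks();
    sleep(20);
    after = cpu0_idle_ticks();

    if (before < 0 || after < 0)
        return 1;

    printf("cpu0 idle delta over 20 seconds: %lld ticks\n", after - before);
    return 0;
}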

Does the experiment miss the point (or do I), or does this disprove the idea?

(Interestingly, according to virt-manager, the load distribution between the VMs looked like:

VM1: 7/16 = 43.75%
VM2: 7/16 = 43.75%
VM3: 2/16 = 1/8 = 12.50%

as if VM3's load had been carved out first and the rest split evenly between VM1 and VM2. When I stop VM1 and VM2, VM3 stays at 12.5%. Under the above load, I would have expected:

VM1: 8/17 ~= 47.06%
VM2: 8/17 ~= 47.06%
VM3: 1/17 ~= 5.88%

ie. "eight and half" VCPUs sharing the host evenly. Could this have any relevance?)

Thank you
Laszlo

Attachment: 50.c
Description: Text document

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

