
Re: [Xen-devel] [PATCH] xen/arm: introduce vwfi parameter



Hi Stefano,

I have CCed another ARM person who has more knowledge than me on scheduling/power.

On 02/17/2017 10:50 PM, Stefano Stabellini wrote:
CC'ing xen-devel, I forgot on the original patch

On Fri, 17 Feb 2017, Julien Grall wrote:
Hi Stefano,

On 02/16/2017 11:04 PM, Stefano Stabellini wrote:
Introduce new Xen command line parameter called "vwfi", which stands for
virtual wfi. The default is "sleep": on guest wfi, Xen calls vcpu_block
on the guest vcpu. The behavior can be changed setting vwfi to "idle",
in that case Xen calls vcpu_yield.

The result is strong reduction in irq latency (8050ns -> 3500ns) at the
cost of idle_loop being called less often, leading to higher power
consumption.

Please explain in which context this will be beneficial. My gut feeling is that it will only make performance worse if multiple vCPUs of the same guest are running on the same pCPU.

I am not a scheduler expert, but I don't think so. Let me explain the
difference:

- vcpu_block blocks a vcpu until an event occurs, for example until it
  receives an interrupt

- vcpu_yield stops the vcpu from running until the next scheduler slot

In both cases the vcpu is not run until the next slot, so I don't think
it should make performance worse in multi-vcpu scenarios. But I can
do some tests to double-check.

You still haven't explained how you came up with those numbers. My guess is 1 vCPU per pCPU, but it is not clear from the commit message.

Looking at your answer, I think it is important that everyone in this thread understands the purpose of WFI and how it differs from WFE.

The two instructions provide a way to tell the processor to enter a low-power state. It means the processor can turn off power to some parts (e.g. units, pipelines) to save energy.

The instruction WFE (Wait For Event) waits until the processor receives an event. The definition of event is quite wide: it could be an SEV executed on another processor, or an implementation-defined mechanism (see D1.17.1 in DDI 0487A.k_iss10775). A typical use is spinning on a lock that is already taken: the processor executes WFE and waits until the processor holding the lock releases it and sends an event.

The instruction WFI (Wait For Interrupt) waits until the processor receives an interrupt. A typical use is when a processor has nothing to run: the software puts the processor in low-power mode until an interrupt comes. The software may not receive an interrupt for a while (see the recent RCU bug in Xen, where a processor had nothing to do and stayed in low-power mode).

For both instructions it is normal to see higher latency when receiving an interrupt. Software using them knows there will be an impact, but overall it expects some power to be saved. Whether the current numbers are acceptable is another question.
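For readers less familiar with the two instructions, the canonical patterns look roughly like this (illustrative C with AArch64 inline assembly, not code from this patch; it only builds for ARM targets):

```c
/* Idle loop: nothing to run, sleep until an interrupt arrives. */
static inline void cpu_idle(void)
{
    for ( ;; )
        asm volatile ("wfi" ::: "memory"); /* wakes on any interrupt */
}

/* Lock-wait loop: sleep until the lock holder executes SEV on unlock. */
static inline void spin_wait(volatile int *lock)
{
    while ( *lock )
        asm volatile ("wfe" ::: "memory"); /* wakes on event (e.g. SEV) */
}
```

In both cases the caller has explicitly chosen to trade wake-up latency for power.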

Now, regarding what you said: imagine the scheduler deschedules the vCPU until the next slot. It will then run the vCPU again even if no interrupt has been received. This is a real waste of power, and it gets worse if no interrupt comes for multiple slots.

In the multi-vcpu case, a guest using WFI will consume more slots than it did before. This means fewer slots for vCPUs that actually have real work to do. So yes, this will have an impact on the same workload before and after your patch.



Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
CC: dario.faggioli@xxxxxxxxxx
---
 docs/misc/xen-command-line.markdown | 11 +++++++++++
 xen/arch/arm/traps.c                | 17 +++++++++++++++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index a11fdf9..5d003e4 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1632,6 +1632,17 @@ Note that if **watchdog** option is also specified
vpmu will be turned off.
 As the virtualisation is not 100% safe, don't use the vpmu flag on
 production systems (see http://xenbits.xen.org/xsa/advisory-163.html)!

+### vwfi
+> `= sleep | idle
+
+> Default: `sleep`
+
+WFI is the ARM instruction to "wait for interrupt". This option, which
+is ARM specific, changes the way guest WFI is implemented in Xen. By
+default, Xen blocks the guest vcpu, putting it to sleep. When setting
+vwfi to `idle`, Xen idles the guest vcpu instead, resulting in lower
+interrupt latency, but higher power consumption.

The main point of using WFI is power saving. With this change, you will end up in a busy loop and, as you said, consume more power.

That's not true: the vcpu is still descheduled until the next slot.
There is no busy loop (that would indeed be very bad).

As Dario answered in a separate e-mail, this will depend on the scheduler. Regardless, to me you are still busy looping, because you go from one slot to the next until, maybe, an interrupt eventually comes. So yes, the power consumption is much worse if a guest vCPU is doing nothing.



I don't think this is acceptable even to get better interrupt latency. Some
workloads care about both interrupt latency and power.

I think a better approach would be to check whether the scheduler has another
vCPU to run. If not, wait for an interrupt in the trap.

This would save the context switch to the idle vCPU if we are still on the
time slice of the vCPU.
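A rough sketch of what that check might look like in the WFI trap path (the helper names here are hypothetical, Xen does not expose such an API as-is):

```c
/* Hypothetical: on guest WFI, only deschedule if some other vCPU is
 * runnable on this pCPU; otherwise wait for the interrupt right here
 * and avoid the context switch to the idle vCPU. */
static void vwfi_trap(struct vcpu *v)
{
    if ( sched_has_runnable_vcpu(smp_processor_id()) ) /* hypothetical */
        vcpu_block(v);        /* someone else can use the pCPU */
    else
        wait_for_interrupt(); /* hypothetical: WFI with IRQs routed to Xen */
}
```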

From my limited understanding of how schedulers work, I think this
cannot work reliably. It is the scheduler that needs to tell the
arch-specific code to put a pcpu to sleep, not the other way around. I
would appreciate it if Dario could confirm this, though.

If my understanding is correct, your workload is 1 vCPU per pCPU. So why do you need to trap WFI/WFE in this case?

Wouldn't it be easier to let the guest use them directly?



Likely this will not fit everyone, so I would add a knob to change the
behavior of WFI depending on how many vCPUs are scheduled on the current pCPU.
But this could be done as a second step.

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

