
Re: [Xen-devel] [RFC 1/6] xen/arm: Re-enable interrupt later in the trap path

To: Dario Faggioli <dfaggioli@xxxxxxxx>, "julien.grall@xxxxxxx" <julien.grall@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
From: Andrii Anisov <andrii.anisov@xxxxxxxxx>
Date: Tue, 6 Aug 2019 16:09:53 +0300
Cc: "andrii_anisov@xxxxxxxx" <andrii_anisov@xxxxxxxx>, "sstabellini@xxxxxxxxxx" <sstabellini@xxxxxxxxxx>, "Volodymyr_Babchuk@xxxxxxxx" <Volodymyr_Babchuk@xxxxxxxx>
Delivery-date: Tue, 06 Aug 2019 13:09:59 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hello Dario,

Please see my comments below:

On 03.08.19 03:55, Dario Faggioli wrote:

On Fri, 2019-08-02 at 16:07 +0300, Andrii Anisov wrote:

On 02.08.19 12:15, Julien Grall wrote:

From the list below it is not clear what the split between
hypervisor time and guest time is. See some of the examples below.


I guess your question is *why* I split hyp/guest time in such a
way.

So for the guest I count the time spent in guest mode, plus the time
spent in hypervisor mode serving explicit requests from the guest.

From an accuracy, but also from a fairness perspective:
- what a guest does directly (in guest mode)
- what the hypervisor does, on behalf of a guest, no matter whether
requested explicitly or not
should all be accounted to the guest. In the sense that the guest
should be charged for it.


For the interrupts and implicit overhead I'd give an example (for ARM, and a
bit simplified):

In the IRQ trap path there is a function `enter_hypervisor_head()`, which
synchronizes the GIC state of the interrupted VCPU to its VGIC representation
(manipulates peripheral registers, goes through queues, etc.). Let's imagine we
have a running VCPU that belongs to domain A, and it is interrupted by an
interrupt that belongs to domain B. Which domain's budget should be charged for
the `enter_hypervisor_head()` execution time?
The budget of domain A? But the work was not initiated by domain A.
The budget of domain B? But the `enter_hypervisor_head()` execution time depends
on domain A's configuration and workload.

If this example looks too simple, please add nested interrupts and a guest
switch on return from hyp. And remember there is some mandatory non-softirq
work in `leave_hypervisor_tail()`.
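
To make the domain A / domain B question concrete, here is a minimal sketch of
the three possible answers as an accounting policy. It is only an illustration:
`irq_charge_policy`, `time_acct` and `charge_irq_entry()` are hypothetical names
and do not exist in Xen.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical policy choices; none of these names exist in Xen. */
enum irq_charge_policy {
    CHARGE_INTERRUPTED, /* bill the domain whose VCPU was running (A) */
    CHARGE_TARGET,      /* bill the domain the interrupt belongs to (B) */
    CHARGE_HYPERVISOR   /* bill no guest; book it as hypervisor overhead */
};

#define MAX_DOMS 4

struct time_acct {
    uint64_t dom_ns[MAX_DOMS]; /* time charged to each domain's budget */
    uint64_t hyp_ns;           /* time booked to the hypervisor itself */
};

/* Book `ns` nanoseconds of trap-entry work according to `policy`. */
static void charge_irq_entry(struct time_acct *acct,
                             enum irq_charge_policy policy,
                             unsigned int interrupted_dom,
                             unsigned int target_dom,
                             uint64_t ns)
{
    assert(interrupted_dom < MAX_DOMS && target_dom < MAX_DOMS);

    switch (policy) {
    case CHARGE_INTERRUPTED:
        acct->dom_ns[interrupted_dom] += ns;
        break;
    case CHARGE_TARGET:
        acct->dom_ns[target_dom] += ns;
        break;
    case CHARGE_HYPERVISOR:
        acct->hyp_ns += ns;
        break;
    }
}
```

None of the three branches is obviously right for the trap-entry work, which is
exactly the point of the example.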

Actually, the concepts of "guest time" and "hypervisor time" are
orthogonal to the accounting, at least ideally.

In fact, when a guest does a hypercall, the time that we spend inside
Xen for performing the hypercall itself:
* is hypervisor time
* the guest that did the hypercall should be charged for it.

If we don't charge the guest for these activities, in theory, a guest can
start doing a lot of hypercalls and generating a lot of interrupts...
since most of the time is spent in the hypervisor, its runtime (from
the scheduler's point of view) increases only a little, and the scheduler
will continue to run it, and it will continue to generate hypercalls
and interrupts, until it starves/DoSes the system!

In fact, right now this can't happen because we always charge guests
for the time spent doing these things. The problem is that we often
charge _the_wrong_ guest. This somewhat manages to prevent a DoS
situation (or makes it very unlikely), but is indeed unfair, and may
cause problems (especially in RT scenarios).
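
The starvation argument above can be put into numbers with a small sketch. It
assumes a simple per-VCPU budget that is consumed by the runtime the scheduler
sees; `vcpu_slice`, `charged_runtime()` and `slices_until_exhausted()` are
hypothetical names for illustration, not Xen scheduler code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One scheduling slice of a VCPU (hypothetical structure). */
struct vcpu_slice {
    uint64_t guest_ns; /* time executed in guest mode */
    uint64_t hyp_ns;   /* time Xen spent on behalf of this guest */
};

/* Runtime the scheduler would see for one slice. */
static uint64_t charged_runtime(const struct vcpu_slice *s, bool charge_hyp)
{
    return s->guest_ns + (charge_hyp ? s->hyp_ns : 0);
}

/* Number of slices the VCPU runs before its budget is used up. */
static uint64_t slices_until_exhausted(const struct vcpu_slice *s,
                                       uint64_t budget_ns, bool charge_hyp)
{
    uint64_t per_slice = charged_runtime(s, charge_hyp);

    assert(per_slice > 0);
    /* Round up: a partially covered slice still exhausts the budget. */
    return (budget_ns + per_slice - 1) / per_slice;
}
```

With a slice of 100 ns in guest mode and 900 ns in Xen, and a budget of
10000 ns, the budget is exhausted after 10 slices when hypervisor time is
charged, but only after 100 slices (ten times the budget in real CPU time) when
it is not.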

That time may be quite deterministic from the guest's point of view.

But the time the hypervisor spends handling interrupts and updating the
hardware state is not requested by the guest itself. It is a
virtualization overhead.

Yes, but still, when it is the guest that causes such overhead, it is
important that the guest itself gets to pay for it.

Just as an example (although you don't have this problem on ARM), if I
have an HVM, ideally I would charge to the guest the time that QEMU
executes in dom0!

On the other hand, the time that we spend in the scheduler, for
instance, doing load balancing among the various runqueues, or the time
that we spend in Xen (on x86) for time synchronization rendezvouses,
they should not be charged to any guest.

And the overhead heavily depends on the system configuration (e.g.
how many guests are running).
That overhead may be accounted to a guest or to hyp, depending on
the model agreed.

Load balancing within the scheduler indeed depends on how busy the
system is, and I agree that time should not be accounted against any guest.


Agree.

Saving and restoring the register state of a guest, I don't think it
depends on how many other guests there are around, and I think it should
be accounted against the guest itself.


I'd sum up saving and restoring the register state of a guest as the guest
context switch.
So one particular guest context switch does not depend on the system load. But
the number of context switches directly depends on how many guests are
available and how busy they are.
So the guest context switch overhead varies and might affect sensitive guests
if they are charged for it.
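
As a rough illustration of that scaling, assuming a fixed cost per context
switch (the function names and any figures plugged in are made up, not
measurements):

```c
#include <assert.h>
#include <stdint.h>

/* Context-switch time charged to one guest over an accounting period. */
static uint64_t ctx_switch_charge_ns(uint64_t switches, uint64_t cost_ns)
{
    return switches * cost_ns;
}

/* Share of the period lost to context switches, in parts per million. */
static uint64_t ctx_switch_share_ppm(uint64_t switches, uint64_t cost_ns,
                                     uint64_t period_ns)
{
    assert(period_ns > 0);
    return ctx_switch_charge_ns(switches, cost_ns) * 1000000u / period_ns;
}
```

With an assumed cost of 5000 ns per switch and a 10 ms period, 10 switches take
5000 ppm (0.5%) of the period, while 100 switches take 50000 ppm (5%): the cost
per switch is constant, but what a guest is charged grows with how busy the
other guests are.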

My idea is as follows:
Accounting that overhead to guests is quite OK for server
applications: you put the server overhead time on guests and charge
money from their budget.

I disagree. The benefits of more accurate and correct time accounting
and charging are not workload or use case dependent.


I would agree, for an ideal system.
But more accurate and correct time accounting and charging is more expensive at
runtime or, sometimes, impossible to implement.
This makes "time accounting and charging" use case dependent.

If we decide to
charge the guest for hypercalls it does and interrupts it receives,
then we should do that, both for servers and for embedded RT systems.
As said, I believe this is one of those cases where we want a unified
approach.


I totally agree with guests being charged for hypercalls. But I have doubts
about interrupts.

And not because it's easier, or because "Xen has to work both
on servers and embedded" (which, BTW, is true). But because it is the
right thing to do, IMO.


p.s. I've spent some more time looking through the Linux kernel's time
accounting implementation. They have IRQ time accounting configurable. Here is
their justification [1].
p.p.s. I'm looking through FreeRTOS as well, to get a wider look at the
available approaches.

[1] http://lkml.iu.edu/hypermail//linux/kernel/1010.0/01175.html

--
Sincerely,
Andrii Anisov.
