Re: [Xen-devel] [RFC 16/16] xen/arm: Track page accessed between batch of Set/Way operations

Hi Jan,

On 09/10/2018 12:43, Jan Beulich wrote:
On 09.10.18 at 12:16, <julien.grall@xxxxxxx> wrote:
On 09/10/2018 08:04, Jan Beulich wrote:
On 08.10.18 at 20:33, <julien.grall@xxxxxxx> wrote:
A new per-arch helper is introduced to perform actions just
before the guest is first unpaused. This will be used to invalidate the
P2M to track access from the start of the guest.

While I'm not opposed to the new arch hook, why don't you create the
p2m entries in their intended state right away? At the very least this
would have the benefit of confining the entire change to Arm code.

Let me start by saying I think having a hook to perform an action once
the VM has been fully created is quite useful. For instance, this could
be used on Arm to limit the invalidation of the icache. At the moment,
we invalidate the icache for every populate memory hypercall. This is
quite a waste of cycles.

As said - I'm not opposed to such a hook in principle, but I'd like
to understand the reasons (and in particular whether there's an
alternative without introducing such a hook).

In this particular circumstance, I would still like to use the hardware
for walking the page-tables during domain creation (i.e. when copying the
binary over). This would not be possible if we create the entries with
the valid bit unset.

This must be something Arm specific, since afaiu we're talking about
arbitrary domain creation here, not just Dom0. On x86 it would be
basically impossible to re-use the page tables created for the guest
to access guest memory from the control domain. (It could be made
work by inserting sub-trees into the control domain's page tables,
but obviously there would be a fair chance of conflict between the
virtual addresses the control domain uses for its own purposes and
the ones where the destination range in the domain being created

Well, we don't share sub-trees on Arm, yet having the valid bit set from the start is still useful if you want to use the hardware to translate a guest address (e.g. by switching between page-tables). This avoids the software lookup.

Furthermore, we don't need to create entries with the valid bit unset once the
guest is running. So we would need to check in the P2M code whether the
guest is running and whether the IOMMU is enabled.

Well, looking at the patch context of your change, it is quite clear
that this would be pretty easy - simply taking d->creation_finished
into account.

I never said I couldn't use d->creation_finished. It is possible to spread it everywhere if we want to. But what's the point when it can be done in a single place?
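
To make the "single place" argument concrete, here is a rough sketch, not the actual patch. The struct is a cut-down stand-in for Xen's struct domain; only d->creation_finished comes from the discussion above. The pause_count and p2m_invalidations fields, and both function names, are hypothetical and exist purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Cut-down stand-in for struct domain (assumption, not the real layout). */
struct domain {
    bool creation_finished;         /* set when the guest is first unpaused */
    unsigned int pause_count;       /* hypothetical controller pause count */
    unsigned int p2m_invalidations; /* hypothetical counter, for illustration */
};

/* Hypothetical per-arch hook: on Arm this is where the P2M would be
 * invalidated so that accesses are tracked from the start of the guest. */
static void arch_domain_creation_finished(struct domain *d)
{
    d->p2m_invalidations++;
}

/* Simplified model of the unpause path: the hook runs exactly once,
 * the first time the guest is unpaused. */
static void domain_unpause_by_controller(struct domain *d)
{
    assert(d != NULL);

    if (d->pause_count > 0)
        d->pause_count--;

    if (!d->creation_finished) {
        d->creation_finished = true;
        arch_domain_creation_finished(d);
    }
}
```

With this shape, the P2M code itself never has to test d->creation_finished; the one-off work lives in the hook.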

Furthermore, as I suggested at the beginning of my previous answer, I can see other uses for this new hook. I am quite surprised you don't see any benefit on x86 too.

For instance, looking at the memory subsystem, it would be possible to defer the TLB flush until the domain first runs.


Julien Grall

