
Re: [PATCH v3] xen/arm: Convert runstate address during hypcall



Hi Bertrand,

On 31/07/2020 14:16, Bertrand Marquis wrote:


On 30 Jul 2020, at 22:50, Julien Grall <julien@xxxxxxx> wrote:
On 30/07/2020 11:24, Bertrand Marquis wrote:
At the moment on Arm, a Linux guest running with KPTI enabled will
cause the following error when a context switch happens in user mode:
(XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0
The error is caused by the virtual address for the runstate area
registered by the guest only being accessible when the guest is running
in kernel space when KPTI is enabled.
To solve this issue, this patch translates the virtual address to a
physical address during the hypercall and maps the required pages using
vmap. This removes the conversion from virtual to physical address
during the context switch, which solves the problem with KPTI.
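
To illustrate the difference between the two approaches described above, here
is a minimal, self-contained C sketch. It is a toy model, not Xen code: every
name in it (toy_pt, toy_vcpu, toy_translate and so on) is hypothetical, a
single-entry table stands in for the guest page tables, and the physical
address used with it is made up. It only shows why resolving the address once,
while the kernel mappings are active, keeps working after the guest has
switched to a user-mode view in which the runstate address is not mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-entry "page table": maps at most one VA to one PA. */
struct toy_pt {
    bool     mapped;  /* is the runstate VA mapped in this view? */
    uint64_t va;
    uint64_t pa;
};

/* Hypothetical per-vCPU state for the runstate area. */
struct toy_vcpu {
    const struct toy_pt *active_pt;  /* view active right now */
    uint64_t runstate_va;            /* old scheme: VA walked on every switch */
    bool     runstate_pa_valid;
    uint64_t runstate_pa;            /* new scheme: PA resolved once */
};

/* Walk the given view; fails if the VA is not mapped there. */
bool toy_translate(const struct toy_pt *pt, uint64_t va, uint64_t *pa)
{
    assert(pa != NULL);
    if (!pt->mapped || pt->va != va)
        return false;
    *pa = pt->pa;
    return true;
}

/*
 * Registration, assumed to run in hypercall context while the guest
 * kernel mappings are active: translate once and remember the result.
 */
bool toy_register_runstate(struct toy_vcpu *v, uint64_t va)
{
    v->runstate_va = va;
    v->runstate_pa_valid = toy_translate(v->active_pt, va, &v->runstate_pa);
    return v->runstate_pa_valid;
}

/* Old scheme: translate at context switch, using whatever view is active. */
bool toy_update_at_switch(const struct toy_vcpu *v, uint64_t *pa)
{
    return toy_translate(v->active_pt, v->runstate_va, pa);
}

/* New scheme: reuse the address resolved at registration time. */
bool toy_update_cached(const struct toy_vcpu *v, uint64_t *pa)
{
    assert(pa != NULL);
    if (!v->runstate_pa_valid)
        return false;
    *pa = v->runstate_pa;
    return true;
}
```

In this model, registering while the kernel view is active and then switching
active_pt to a user view where the address is unmapped makes
toy_update_at_switch() fail, while toy_update_cached() still returns the
address resolved at registration time.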

To echo what Jan said on the previous version, this is a change in a stable ABI
and therefore may break existing guests. FAOD, I agree in principle with the
idea. However, we want to explain why breaking the ABI is the *only* viable
solution.

From my understanding, it is not possible to fix this without an ABI breakage
because the hypervisor doesn't know when the guest will switch back from
userspace to kernel space. The risk is that the runstate area wouldn't contain
accurate information, which could affect how the guest handles stolen time.

Additionally, there are a few issues with the current interface:
   1) It assumes the virtual address cannot be re-used by userspace.
Thankfully, Linux has a split address space. But this may change with KPTI in
place.
   2) When updating the page-tables, the guest has to go through an invalid
mapping. So the translation may fail at any point.

IOW, the existing interface can lead to random memory corruption and inaccuracy
of the stolen time.

I agree, but I am not sure what you want me to do here.
Should I add more details in the commit message?


This is done only on arm architecture, the behaviour on x86 is not
modified by this patch and the address conversion is done as before
during each context switch.
This is introducing several limitations in comparison to the previous
behaviour (on arm only):
- if the guest is remapping the area at a different physical address Xen
will continue to update the area at the previous physical address. As
the area is in kernel space and usually defined as a global variable this
is something which is believed not to happen. If this is required by a
guest, it will have to call the hypercall with the new area (even if it
is at the same virtual address).
- the area needs to be mapped during the hypercall. For the same reasons
as for the previous case, even if the area is registered for a different
vcpu. It is believed that registering an area using a virtual address
unmapped is not something done.

It is not clear whether the virtual address refers to the current vCPU or the
vCPU you register the runstate for. From the past discussion, I think you refer
to the former. It would be good to clarify.

OK, I will try to clarify.


Additionally, all the new restrictions should be documented in the public
interface, so that an OS developer can find the differences between the
architectures.

To answer Jan's concern, we certainly don't know every existing guest OS;
however, we also need to balance that against the benefit for a large majority
of the users.

From previous discussion, the current approach was deemed to be acceptable on
Arm and, AFAICT, also x86 (see [1]).

TBH, I would rather the approach be common. For that, we would need an
agreement from Andrew and Jan on the approach here. Meanwhile, I think this is
the best approach to address the concern from Arm users.

From this I get that you want me to document the specific behaviour on Arm in
the public header describing the hypercall, right?

Yes please. The public header is usually where an OS developer will look for details. Although, at the moment, the documentation is not great, as you often have to dig into the Xen code to understand how it is meant to work :(. But we are trying to improve that.

Cheers,

--
Julien Grall



 

