
Re: Error during update_runstate_area with KPTI activated

To: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
From: Julien Grall <julien@xxxxxxx>
Date: Thu, 14 May 2020 18:38:59 +0100
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, nd <nd@xxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
Delivery-date: Thu, 14 May 2020 17:39:03 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

On 14/05/2020 17:18, Bertrand Marquis wrote:

On 14 May 2020, at 16:57, Julien Grall <julien@xxxxxxx> wrote:



On 14/05/2020 15:28, Bertrand Marquis wrote:

Hi,

Hi,

When running Linux on arm64 with KPTI enabled (in Dom0 or in a DomU), I 
get a lot of page-table walk errors like this:
(XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0
After adding a call trace, I found that the problem comes from 
update_runstate_area when Linux has KPTI enabled.
I have the following call trace:
(XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0
(XEN) backtrace.c:29: Stacktrace start at 0x8007638efbb0 depth 10
(XEN)    [<000000000027780c>] get_page_from_gva+0x180/0x35c
(XEN)    [<00000000002700c8>] guestcopy.c#copy_guest+0x1b0/0x2e4
(XEN)    [<0000000000270228>] raw_copy_to_guest+0x2c/0x34
(XEN)    [<0000000000268dd0>] domain.c#update_runstate_area+0x90/0xc8
(XEN)    [<000000000026909c>] domain.c#schedule_tail+0x294/0x2d8
(XEN)    [<0000000000269524>] context_switch+0x58/0x70
(XEN)    [<00000000002479c4>] core.c#sched_context_switch+0x88/0x1e4
(XEN)    [<000000000024845c>] core.c#schedule+0x224/0x2ec
(XEN)    [<0000000000224018>] softirq.c#__do_softirq+0xe4/0x128
(XEN)    [<00000000002240d4>] do_softirq+0x14/0x1c
Discussing this subject with Stefano, he pointed me to a discussion started a 
year ago on this subject here:
https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03053.html
And a patch was submitted:
https://lists.xenproject.org/archives/html/xen-devel/2019-05/msg02320.html
I rebased this patch on current master and it solves the problem I have 
seen.
It sounds to me like a good solution to introduce a 
VCPUOP_register_runstate_phys_memory_area so that Xen does not depend on the 
area actually being mapped in the guest when a context switch is done (which 
is exactly the problem that occurs when a context switch is triggered while a 
guest is running in EL0).
Is there any reason why this was not merged in the end?
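
To illustrate the failure mode described above, here is a minimal toy model in C. It is only a sketch under stated assumptions and is not taken from Xen or from the submitted patch: every name in it (toy_vcpu, toy_walk, toy_update_runstate and so on) is hypothetical, guest memory is a flat array, and KPTI is reduced to a single flag saying whether the kernel mappings are currently visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: none of these names exist in Xen. */
#define TOY_RAM_SIZE 4096u

struct toy_runstate {
    uint64_t state_entry_time;
    int state;
};

struct toy_vcpu {
    uint8_t ram[TOY_RAM_SIZE];  /* guest physical memory */
    uint64_t kernel_va_base;    /* kernel VA that maps guest physical 0 */
    bool kernel_mapped;         /* false while in EL0 with KPTI enabled */
    bool by_phys;               /* true: area registered by physical address */
    uint64_t area_va;           /* area registered by virtual address */
    uint64_t area_pa;           /* area registered by physical address */
};

/* Translate a kernel VA through the page tables that are active right now. */
static bool toy_walk(const struct toy_vcpu *v, uint64_t va, uint64_t *pa)
{
    if (!v->kernel_mapped || va < v->kernel_va_base)
        return false;           /* models "Failed to walk page-table" */
    *pa = va - v->kernel_va_base;
    return true;
}

/* Copy the runstate into the guest; returns false if the copy is impossible. */
static bool toy_update_runstate(struct toy_vcpu *v,
                                const struct toy_runstate *rs)
{
    uint64_t pa;

    assert(v != NULL && rs != NULL);

    if (v->by_phys)
        pa = v->area_pa;        /* no dependency on the guest page tables */
    else if (!toy_walk(v, v->area_va, &pa))
        return false;           /* VA not mapped at context-switch time */

    if (pa > TOY_RAM_SIZE - sizeof(*rs))
        return false;           /* area would fall outside guest memory */

    memcpy(&v->ram[pa], rs, sizeof(*rs));
    return true;
}
```

In this model, an area registered by virtual address can be updated while the kernel mappings are visible, but the update fails once kernel_mapped is false, which is the situation of a context switch interrupting EL0 under KPTI. With by_phys set, the same update succeeds regardless of which page tables are active.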


I just skimmed through the thread to remind myself of the state. AFAICT, this 
is blocked on the contributor to clarify the intended interaction and provide a 
new version.


What do you mean here by intended interaction? How the new hypercall should 
be used by the guest OS?

From what I remember, Jan was seeking clarification on whether the two hypercalls (existing and new) can be called together by the same OS (and make sense).

There was also the question of the handover between two pieces of software. For instance, what if the firmware is using the existing interface but the OS the new one? Similar question about kexec'ing a different kernel.

This part is mostly documentation, so we can discuss the approach and review the implementation.


I am still in favor of the new hypercall (and still in my todo list) but I 
haven't yet found time to revive the series.

Would you be willing to take over the series? I would be happy to bring you up 
to speed and provide review.


Sure I can take it over.

I ported it to the master version of Xen and tested it on a board.
I still need to do a deep review of the code myself, but I understand the 
problem and the idea behind the fix.

Any help getting up to speed would be more than welcome :-)

I would recommend going through the latest version (v3) and the previous one (v2). I am also suggesting v2 because I think the split was easier to review/understand.

The x86 code is probably what is going to give you the most trouble as there are two ABIs to support (compat and non-compat). If you don't have an x86 setup, I should be able to test it/help write it.

Feel free to ask any questions and I will try my best to remember the discussion from last year :).


Cheers,

--
Julien Grall
