
Re: Error during update_runstate_area with KPTI activated



On Thu, 14 May 2020 at 19:12, Andrew Cooper <andrew.cooper3@xxxxxxxxxx> wrote:
>
> On 14/05/2020 18:38, Julien Grall wrote:
> > Hi,
> >
> > On 14/05/2020 17:18, Bertrand Marquis wrote:
> >>
> >>
> >>> On 14 May 2020, at 16:57, Julien Grall <julien@xxxxxxx> wrote:
> >>>
> >>>
> >>>
> >>> On 14/05/2020 15:28, Bertrand Marquis wrote:
> >>>> Hi,
> >>>
> >>> Hi,
> >>>
> >>>> When running Linux on arm64 with KPTI activated (in Dom0 or in a
> >>>> DomU), I see a lot of page-table walk errors like this:
> >>>> (XEN) p2m.c:1890: d1v0: Failed to walk page-table va
> >>>> 0xffffff837ebe0cd0
> >>>> After adding a call trace, I found that the problem comes from
> >>>> update_runstate_area when Linux has KPTI activated.
> >>>> I get the following call trace:
> >>>> (XEN) p2m.c:1890: d1v0: Failed to walk page-table va
> >>>> 0xffffff837ebe0cd0
> >>>> (XEN) backtrace.c:29: Stacktrace start at 0x8007638efbb0 depth 10
> >>>> (XEN)    [<000000000027780c>] get_page_from_gva+0x180/0x35c
> >>>> (XEN)    [<00000000002700c8>] guestcopy.c#copy_guest+0x1b0/0x2e4
> >>>> (XEN)    [<0000000000270228>] raw_copy_to_guest+0x2c/0x34
> >>>> (XEN)    [<0000000000268dd0>] domain.c#update_runstate_area+0x90/0xc8
> >>>> (XEN)    [<000000000026909c>] domain.c#schedule_tail+0x294/0x2d8
> >>>> (XEN)    [<0000000000269524>] context_switch+0x58/0x70
> >>>> (XEN)    [<00000000002479c4>] core.c#sched_context_switch+0x88/0x1e4
> >>>> (XEN)    [<000000000024845c>] core.c#schedule+0x224/0x2ec
> >>>> (XEN)    [<0000000000224018>] softirq.c#__do_softirq+0xe4/0x128
> >>>> (XEN)    [<00000000002240d4>] do_softirq+0x14/0x1c
> >>>> When I discussed this subject with Stefano, he pointed me to a
> >>>> discussion started a year ago here:
> >>>> https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03053.html
> >>>>
> >>>> And a patch was submitted:
> >>>> https://lists.xenproject.org/archives/html/xen-devel/2019-05/msg02320.html
> >>>>
> >>>> I rebased this patch on current master and it solves the problem
> >>>> I have seen.
> >>>> Introducing a VCPUOP_register_runstate_phys_memory_area sounds like
> >>>> a good solution to me, as it means Xen no longer depends on the
> >>>> area actually being mapped in the guest when a context switch
> >>>> happens (which is exactly the problem when a context switch is
> >>>> triggered while the guest is running in EL0).
> >>>> Is there any reason why this was not merged in the end?
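(To make the failure mode concrete for anyone reading along: the runstate
area is today registered by guest virtual address, roughly like the sketch
below. This is only an illustration of the existing interface, not the
exact Linux code.)

/* Sketch of today's registration, by guest *virtual* address.  Relies on
 * xen/interface/vcpu.h and the usual Linux per-cpu/hypercall helpers. */
static DEFINE_PER_CPU(struct vcpu_runstate_info, runstate);

static void register_runstate_va(unsigned int cpu)
{
    struct vcpu_register_runstate_memory_area area;

    /* Xen remembers this VA and writes the runstate info through it on
     * every context switch of the vCPU. */
    area.addr.v = &per_cpu(runstate, cpu);

    if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_memory_area,
                           cpu, &area))
        BUG();
}

Because only the VA is recorded, update_runstate_area() has to walk
whichever stage-1 page tables are live at the point of the context switch.
With KPTI, if the vCPU was interrupted while running in EL0, those are the
user page tables, which do not map the kernel VA -- hence the
get_page_from_gva() failure in the trace above.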
> >>>
> >>> I just skimmed through the thread to remind myself of the state.
> >>> AFAICT, this is blocked on the contributor clarifying the intended
> >>> interaction and providing a new version.
> >>
> >> What do you mean here by intended interaction? Do you mean how the
> >> new hypercall should be used by the guest OS?
> >
> > From what I remember, Jan was seeking clarification on whether the two
> > hypercalls (existing and new) can be called together by the same OS
> > (and whether that makes sense).
> >
> > There was also the question of the handover between two pieces of
> > software. For instance, what if the firmware is using the existing
> > interface but the OS the new one? A similar question applies to
> > kexec'ing a different kernel.
> >
> > This part is mostly documentation, so we can discuss the approach
> > and review the implementation.
> >
> >>
> >>>
> >>> I am still in favor of the new hypercall (and it is still on my todo
> >>> list) but I haven't yet found time to revive the series.
> >>>
> >>> Would you be willing to take over the series? I would be happy to
> >>> bring you up to speed and provide review.
> >>
> >> Sure, I can take it over.
> >>
> >> I ported it to the master version of Xen and tested it on a board.
> >> I still need to do a deep review of the code myself, but I have an
> >> understanding of the problem and of the idea behind the fix.
> >>
> >> Any help to get up to speed would be more than welcome :-)
> > I would recommend going through the latest version (v3) and the
> > previous one (v2). I am also suggesting v2 because I think the split
> > was easier to review/understand.
> >
> > The x86 code is probably what is going to give you the most trouble as
> > there are two ABIs to support (compat and non-compat). If you don't
> > have an x86 setup, I should be able to test it/help write it.
> >
> > Feel free to ask any questions and I will try my best to remember the
> > discussion from last year :).
>
> At the risk of being shouted down again, a new hypercall isn't strictly
> necessary, and there are probably better ways of fixing this.
>
> The underlying ABI problem is that the area is registered by virtual
> address.  The only correct way this should have been done is to register
> by guest physical address, so Xen's updating of the data doesn't
> interact with the guest pagetable settings/restrictions.  x86 suffers
> the same kind of problems as ARM, except we silently squash the fallout.
>
> The logic in Xen is horrible, and I would really rather it were deleted
> completely than kept around for compatibility.
>
> The runstate area is always fixed kernel memory and doesn't move.  I
> believe it is already restricted from crossing a page boundary, and we
> can calculate the va=>pa translation when the hypercall is made.
>
> Yes - this is technically an ABI change, but nothing is going to break
> (AFAICT) and the cleanup win is large enough to make this a *very*
> attractive option.
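To make sure we are talking about the same thing, the Xen side of what you
describe would look something like the sketch below. All of it is
illustrative: the runstate_map/runstate_off fields don't exist in struct
vcpu today, and the deregistration/teardown paths (put_page(), unmapping)
are omitted.

/* Illustrative only: translate the VA and take the mapping once, at
 * registration time, while the guest page tables are known to map it. */
int map_runstate_area(struct vcpu *v, vaddr_t va)
{
    unsigned int offset = va & ~PAGE_MASK;
    struct page_info *page;

    /* Only works if the area does not cross a page boundary. */
    if ( offset + sizeof(struct vcpu_runstate_info) > PAGE_SIZE )
        return -EINVAL;

    /* Walk the guest page tables *now* and take a reference on the
     * underlying page, so it cannot disappear under our feet. */
    page = get_page_from_gva(v, va, GV2M_WRITE);
    if ( !page )
        return -EINVAL;

    v->runstate_map = __map_domain_page_global(page);
    v->runstate_off = offset;

    return 0;
}

/* Context switch: a plain memcpy, no guest page-table walk any more. */
void update_runstate_area(struct vcpu *v)
{
    if ( v->runstate_map )
        memcpy((uint8_t *)v->runstate_map + v->runstate_off,
               &v->runstate, sizeof(v->runstate));
}

The context-switch path becomes trivial; presumably most of the real
complexity lives in the teardown left out here.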

I suggested this approach two years ago [1], but you were the one
saying that the buffer could cross a page boundary on older Linux [2]:

"I'd love to do this, but we cant.  Older Linux used to have a virtual
buffer spanning a page boundary.  Changing the behaviour under that will
cause older setups to explode."

So can you explain your change of heart here?

>
> I would prefer to fix it like this (perhaps adding a new hypercall
> which explicitly takes a guest physical address) rather than keep any
> of this mess around forevermore to cope with legacy guests.

What does "legacy guests" mean here? Is it only 32-bit PV, or does it also include some HVM guests?
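
For reference, the kind of guest-physical interface I have in mind is
along the lines of the sketch below. The struct layout and names are
purely illustrative (not necessarily what the v3 series used):

/* Illustrative layout for a registration by guest physical address. */
struct vcpu_register_runstate_phys_memory_area {
    uint64_t gfn;     /* guest frame holding the runstate area */
    uint32_t offset;  /* offset within that frame; the area must not
                         cross the frame boundary */
    uint32_t pad;
};

/* Guest side (Linux-ish), no longer dependent on the VA being mapped
 * at context-switch time. */
static void register_runstate_phys(unsigned int cpu)
{
    struct vcpu_runstate_info *rs = &per_cpu(runstate, cpu);
    struct vcpu_register_runstate_phys_memory_area area = {
        .gfn    = virt_to_gfn(rs),
        .offset = offset_in_page(rs),
    };

    if (HYPERVISOR_vcpu_op(VCPUOP_register_runstate_phys_memory_area,
                           cpu, &area))
        BUG();
}

That still leaves the handover questions raised earlier in the thread
(firmware vs. OS, kexec) to be written down.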

Cheers,

[1] <3a77a293-1a29-42ed-8fc0-a74bda213b92@xxxxxxx>
[2] <dc80422f-80bb-bd37-ed41-bb6559f4d7d8@xxxxxxxxxx>
