
Re: Error during update_runstate_area with KPTI activated

  • To: Julien Grall <julien@xxxxxxx>
  • From: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>
  • Date: Thu, 14 May 2020 16:18:26 +0000
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, nd <nd@xxxxxxx>, Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
  • Delivery-date: Thu, 14 May 2020 16:18:43 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

> On 14 May 2020, at 16:57, Julien Grall <julien@xxxxxxx> wrote:
> On 14/05/2020 15:28, Bertrand Marquis wrote:
>> Hi,
> Hi,
>> When executing linux on arm64 with KPTI activated (in Dom0 or in a DomU), I 
>> have a lot of walk page table errors like this:
>> (XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0
>> After implementing a call trace, I found that the problem was coming from 
>> the update_runstate_area when linux has KPTI activated.
>> I have the following call trace:
>> (XEN) p2m.c:1890: d1v0: Failed to walk page-table va 0xffffff837ebe0cd0
>> (XEN) backtrace.c:29: Stacktrace start at 0x8007638efbb0 depth 10
>> (XEN)    [<000000000027780c>] get_page_from_gva+0x180/0x35c
>> (XEN)    [<00000000002700c8>] guestcopy.c#copy_guest+0x1b0/0x2e4
>> (XEN)    [<0000000000270228>] raw_copy_to_guest+0x2c/0x34
>> (XEN)    [<0000000000268dd0>] domain.c#update_runstate_area+0x90/0xc8
>> (XEN)    [<000000000026909c>] domain.c#schedule_tail+0x294/0x2d8
>> (XEN)    [<0000000000269524>] context_switch+0x58/0x70
>> (XEN)    [<00000000002479c4>] core.c#sched_context_switch+0x88/0x1e4
>> (XEN)    [<000000000024845c>] core.c#schedule+0x224/0x2ec
>> (XEN)    [<0000000000224018>] softirq.c#__do_softirq+0xe4/0x128
>> (XEN)    [<00000000002240d4>] do_softirq+0x14/0x1c
>> Discussing this subject with Stefano, he pointed me to a discussion started 
>> a year ago on this subject here:
>> https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03053.html
>> And a patch was submitted:
>> https://lists.xenproject.org/archives/html/xen-devel/2019-05/msg02320.html
>> I rebased this patch on current master and it is solving the problem I have 
>> seen.
>> It sounds to me like a good solution to introduce a 
>> VCPUOP_register_runstate_phys_memory_area to not depend on the area actually 
>> being mapped in the guest when a context switch is being done (which is 
>> actually the problem happening when a context switch is triggered while a 
>> guest is running in EL0).
>> Is there any reason why this was not merged in the end?
> I just skimmed through the thread to remind myself the state. AFAICT, this is 
> blocked on the contributor to clarify the intended interaction and provide a 
> new version.

What do you mean here by intended interaction? Do you mean how the new 
hypercall should be used by the guest OS?

> I am still in favor of the new hypercall (and still in my todo list) but I 
> haven't yet found time to revive the series.
> Would you be willing to take over the series? I would be happy to bring you 
> up to speed and provide review.

Sure, I can take it over.

I ported it to the current master of Xen and tested it on a board.
I still need to do a thorough review of the code myself, but I understand the 
problem and the idea behind the fix.

Any help getting up to speed would be more than welcome :-)
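
To check my understanding of the idea, here is a toy model in C. It is not 
Xen code and not taken from the patch: every name in it (toy_guest, 
toy_update_runstate_va, toy_update_runstate_pa, ...) is made up for 
illustration, and the "page tables" are plain arrays. It assumes that with 
KPTI the kernel data holding the runstate area is not mapped while the guest 
runs in EL0, so a copy by guest virtual address fails at that point, whereas 
a copy to a registered guest physical address does not depend on the active 
page tables.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: every name below is hypothetical, none of it is Xen code. */
#define TOY_PAGE_SIZE 4096u
#define TOY_NR_PAGES  4u
#define TOY_UNMAPPED  (-1)

struct toy_runstate {
    uint32_t state;
    uint64_t state_entry_time;
};

struct toy_guest {
    uint8_t ram[TOY_NR_PAGES * TOY_PAGE_SIZE];
    int kernel_map[TOY_NR_PAGES]; /* VA page -> PA page while in EL1 */
    int user_map[TOY_NR_PAGES];   /* VA page -> PA page while in EL0 (KPTI) */
    bool in_el0;                  /* which set of tables is active */
};

static void toy_guest_init(struct toy_guest *g)
{
    memset(g, 0, sizeof(*g));
    for (unsigned int i = 0; i < TOY_NR_PAGES; i++) {
        g->kernel_map[i] = (int)i;     /* kernel tables: identity map */
        g->user_map[i] = TOY_UNMAPPED; /* user tables: kernel data hidden */
    }
}

/* Update through a guest virtual address: depends on the active tables. */
static bool toy_update_runstate_va(struct toy_guest *g, uint64_t va,
                                   const struct toy_runstate *rs)
{
    uint64_t page = va / TOY_PAGE_SIZE;
    uint64_t off = va % TOY_PAGE_SIZE;
    const int *table = g->in_el0 ? g->user_map : g->kernel_map;

    /* Toy restriction: the area must fit inside a single page. */
    if (page >= TOY_NR_PAGES || off + sizeof(*rs) > TOY_PAGE_SIZE)
        return false;
    if (table[page] == TOY_UNMAPPED)
        return false; /* models the "Failed to walk page-table" case */

    memcpy(&g->ram[(uint64_t)table[page] * TOY_PAGE_SIZE + off],
           rs, sizeof(*rs));
    return true;
}

/* Update through a registered guest physical address: no table walk. */
static bool toy_update_runstate_pa(struct toy_guest *g, uint64_t pa,
                                   const struct toy_runstate *rs)
{
    if (pa > sizeof(g->ram) - sizeof(*rs))
        return false;

    memcpy(&g->ram[pa], rs, sizeof(*rs));
    return true;
}
```

In this model, once in_el0 is set, toy_update_runstate_va() fails for an 
address that worked in EL1, while toy_update_runstate_pa() still succeeds. 
That is the behaviour I would expect from registering the area by physical 
address, but please correct me if the real series works differently.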