
Re: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
  • Date: Thu, 25 Feb 2021 12:51:00 +0000
  • Cc: Julien Grall <julien.grall.oss@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, Meng Xu <mengxu@xxxxxxxxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Thu, 25 Feb 2021 12:51:25 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: [RFC PATCH 00/10] Preemption in hypervisor (ARM only)

Hi Andrew,

Andrew Cooper writes:

> On 24/02/2021 23:58, Volodymyr Babchuk wrote:
>> And I am not mentioning x86 support there...
>
> x86 uses per-pCPU stacks, not per-vCPU stacks.
>
> Transcribing from an old thread which happened in private as part of an
> XSA discussion, concerning the implications of trying to change this.
>
> ~Andrew
>
> -----8<-----
>
> Here is a partial list off the top of my head of the practical problems
> you're going to have to solve.
>
> Introduction of new SpectreRSB vulnerable gadgets.  I'm really close to
> being able to drop RSB stuffing and recover some performance in Xen.
>
> CPL0 entrypoints need updating across schedule.  SYSCALL entry would
> need to become a stub per vcpu, rather than the current stub per pcpu.
> This requires reintroducing a writeable mapping to the TSS (doable) and
> a shadow stack switch of active stacks (This corner case is so broken it
> looks to be a blocker for CET-SS support in Linux, and is resulting in
> some conversation about tweaking Shstk's in future processors).
>
> All per-cpu variables stop working.  You'd need to rewrite Xen to use
> %gs for TLS which will have churn in the PV logic, and introduce the x86
> architectural corner cases of running with an invalid %gs.  Xen has been
> saved from a large number of privilege escalation vulnerabilities in
> common with Linux and Windows by the fact that we don't use %gs, so
> anyone trying to do this is going to have to come up with some concrete
> way of proving that the corner cases are covered.

Thank you. This is exactly what I needed. I am not an x86 expert, but
from what you said, I can see that there is no easy way to switch
contexts while in hypervisor mode.

Then I want to return to the task domain idea, which you mentioned in the
other thread. If I got it right, it would allow us to:

1. Implement asynchronous hypercalls for cases where there is no reason
to hold the calling vCPU in the hypervisor for the whole duration of the
call.

2. Improve time accounting, as tasklets can be scheduled to run in this
task domain.
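
As a rough illustration of the two points above, here is a toy model in
plain Python. It is not Xen code and does not use any Xen API: the
TaskDomain class, submit(), run_one() and the abstract "cost" units are
all hypothetical names invented for this sketch. It only shows the shape
of the idea: the caller queues work and returns at once, and the time
spent doing the work is charged to the task domain rather than to the
calling vCPU.

```python
from collections import deque


class TaskDomain:
    """Toy stand-in for a 'task domain': a queue of deferred work items."""

    def __init__(self):
        self.queue = deque()      # pending (ticket, work, cost) entries
        self.results = {}         # ticket -> result of completed work
        self.accounted_units = 0  # time charged to the task domain itself
        self.next_ticket = 0

    def submit(self, work, cost):
        """Model of an asynchronous hypercall: queue the work, return at once."""
        ticket = self.next_ticket
        self.next_ticket += 1
        self.queue.append((ticket, work, cost))
        return ticket

    def run_one(self):
        """Run one queued item in the task domain's context; False if idle."""
        if not self.queue:
            return False
        ticket, work, cost = self.queue.popleft()
        self.results[ticket] = work()
        self.accounted_units += cost  # charged here, not to the caller
        return True


task_domain = TaskDomain()
vcpu_units = 0  # time charged to the calling vCPU

# The vCPU issues a long-running request and gets control back immediately.
ticket = task_domain.submit(lambda: sum(range(1000)), cost=50)
vcpu_units += 1  # only the cheap submission is charged to the caller

# Later, the scheduler picks the task domain and the work actually runs.
while task_domain.run_one():
    pass

print(task_domain.results[ticket], vcpu_units, task_domain.accounted_units)
```

In this model the caller is charged one unit for the submission, while
the 50 units of real work show up against the task domain. How a guest
would be notified of completion is left out on purpose, since the thread
does not settle that.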

I skimmed through the ML archives, but didn't find any discussion about it.

As I see it, its implementation would be close to the idle domain
implementation, but a little different.

-- 
Volodymyr Babchuk at EPAM


 

