
Re: [Xen-devel] [PATCH RFC v2 00/12] xen/x86: use per-vcpu stacks for 64 bit pv domains



On 23/01/18 10:10, Juergen Gross wrote:
> On 23/01/18 10:31, Jan Beulich wrote:
>>>>> On 23.01.18 at 10:24, <jgross@xxxxxxxx> wrote:
>>> On 23/01/18 09:53, Jan Beulich wrote:
>>>>>>> On 23.01.18 at 07:34, <jgross@xxxxxxxx> wrote:
>>>>> On 22/01/18 19:39, Andrew Cooper wrote:
>>>>>> One of my concerns is that this patch series moves further away from the
>>>>>> secondary goal of my KAISER series, which was to have the IDT and GDT
>>>>>> mapped at the same linear addresses on every CPU so a) SIDT/SGDT don't
>>>>>> leak which CPU you're currently scheduled on into PV guests and b) the
>>>>>> context switch code can drop a load of its slow instructions like LGDT
>>>>>> and the VMWRITEs to update the VMCS.
>>>>> The GDT address of a PV vcpu depends on vcpu_id only. I don't
>>>>> see why the IDT can't be mapped to the same address on each cpu with
>>>>> my approach.
>>>> You're not introducing a per-CPU range in the page tables afaics
>>>> (again from overview and titles only), yet with the IDT needing
>>>> to be per-CPU you'd also need a per-CPU range to map it to if
>>>> you want to avoid the LIDT as well as exposing what CPU you're
>>>> on (same goes for the GDT and the respective avoidance of LGDT
>>>> afaict).
>>> After a quick look I don't see why a Meltdown mitigation can't use
>>> the same IDT for all cpus: the only reason I could find for having
>>> per-cpu IDTs is in the SVM code, so it seems to be AMD-specific.
>>> And AMD won't need XPTI at all.
>> Isn't your RFC series allowing XPTI to be enabled even on AMD?
> Yes, you are right. This might either want to be revisited, or the
> address space activated for SVM domains could map an IDT with the
> IST-related traps removed.

I've experimented quite a lot in this area.  Ideally, we'd vmload/vmsave
in the SVM critical region (like all other hypervisors), at which point
we wouldn't need any adjustments to the IDT (as IST references are safe
to use), and we'd catch stack overflows in the #DF handler rather than
immediately triple faulting.
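
To illustrate the ordering being described, here is a minimal C sketch,
not Xen's actual SVM entry path (which lives in assembly and uses
different helpers); all function names below are illustrative:

#include <stdint.h>

/* Hedged sketch only.  The point is the ordering: host state (including
 * TR) is saved/restored with VMSAVE/VMLOAD inside the CLGI/STGI critical
 * region, so IST-based IDT entries are valid again before any interrupt
 * or NMI can be delivered. */
static inline void clgi(void) { asm volatile ( ".byte 0x0f,0x01,0xdd" ::: "memory" ); }
static inline void stgi(void) { asm volatile ( ".byte 0x0f,0x01,0xdc" ::: "memory" ); }

static inline void vmsave(uint64_t vmcb_pa)  /* VMSAVE: save host segment state */
{
    asm volatile ( ".byte 0x0f,0x01,0xdb" :: "a" (vmcb_pa) : "memory" );
}

static inline void vmload(uint64_t vmcb_pa)  /* VMLOAD: restore host segment state */
{
    asm volatile ( ".byte 0x0f,0x01,0xda" :: "a" (vmcb_pa) : "memory" );
}

static inline void vmrun(uint64_t vmcb_pa)   /* VMRUN: enter the guest */
{
    asm volatile ( ".byte 0x0f,0x01,0xd8" :: "a" (vmcb_pa) : "memory" );
}

void svm_do_run(uint64_t host_vmcb_pa, uint64_t guest_vmcb_pa)
{
    clgi();                  /* no interrupts/NMIs from here on           */
    vmsave(host_vmcb_pa);    /* stash host TR, segment bases, etc.        */
    vmrun(guest_vmcb_pa);    /* run the guest; returns on #VMEXIT         */
    vmload(host_vmcb_pa);    /* host TR valid again => IST refs are safe  */
    stgi();                  /* only now can interrupts/NMIs arrive       */
}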

Both using LIDT to switch between alternative IDTs and using INVLPG to
swap the mapping under a fixed linear address are much slower than the
current implementation.
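
As a rough illustration of the two alternatives being compared (the
struct layout and function names are mine, not Xen's):

#include <stdint.h>

struct __attribute__((packed)) desc_ptr {
    uint16_t limit;
    uint64_t base;
};

/* Alternative 1: keep per-CPU IDTs and re-point IDTR on context switch. */
static inline void load_idt(const struct desc_ptr *idtr)
{
    asm volatile ( "lidt %0" :: "m" (*idtr) );
}

/* Alternative 2: keep IDTR fixed and swap which frame backs the fixed
 * linear address, then flush the stale translation.  (The PTE rewrite
 * itself is omitted here.)  Both LIDT and INVLPG are costly. */
static inline void flush_idt_va(const void *fixed_idt_va)
{
    asm volatile ( "invlpg %0" :: "m" (*(const char *)fixed_idt_va) : "memory" );
}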

>
>>> The GDT of pv domains is already in the per-domain region even without
>>> my patches, so I don't have to change anything regarding usage of LGDT.
>> Andrew's point was that eliminating the LGDT is a secondary goal.
> With per-cpu mappings this is surely an obvious optimization. In the
> end the overall performance should be taken as the basis for a decision.
> His main point was avoiding exposing data like the physical cpu number,
> and this doesn't apply here, as the GDT is per vcpu in my case.

The GDT leaks vcpu_id into guest userspace, which is similarly problematic.

The secondary goals of my KAISER series stand irrespective of the
Meltdown issues:
* The stack and mutable critical structures really should be numa-local
to the CPU using them.
* The GDT should sit fully fat over zeros, i.e. the whole GDT limit
should be backed by zeroed mappings.  At the moment, in HVM context,
there are 14 frames of arbitrary directmap living within the GDT limit.
* The IDT/GDT should exist at the same linear address on every pcpu to
avoid leaking information (this property is what allows the removal of
the lgdt from the context switch path; see the sketch after this list).
* The critical data structures should be mapped read-only to make
exploitation harder for an attacker with a write primitive.
* With the stack at the same linear address on each CPU, we don't need
the syscall stubs, and the TSS is identical on all cpus.
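
A minimal sketch of the same-linear-address and read-only points,
assuming per-CPU page tables and a hypothetical mapping helper (neither
the address, the flag, nor the helper name is real Xen code):

#include <stdint.h>

/* Illustrative only: one common linear address at which every pcpu maps
 * its own GDT frame in its private page tables. */
#define COMMON_GDT_VA  0xffff82d0c0000000UL   /* made-up address */

/* Hypothetical helper, not a real Xen interface: install mfn at va in
 * this CPU's private page tables with the given flags. */
extern void percpu_map_page(unsigned int cpu, uint64_t va, uint64_t mfn,
                            unsigned int flags);
#define PAGEFLAG_RO  0x1   /* made-up flag: present, read-only, supervisor */

void setup_cpu_gdt(unsigned int cpu, uint64_t gdt_mfn)
{
    /* A read-only mapping hardens against an attacker with a write
     * primitive; the common address means lgdt is needed once at boot
     * rather than on every context switch, and SGDT reveals nothing
     * CPU-specific to guests. */
    percpu_map_page(cpu, COMMON_GDT_VA, gdt_mfn, PAGEFLAG_RO);
}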

In some copious free time, it would be nice to fix these issues.

~Andrew
