Re: [Xen-devel] [PATCH FAIRLY-RFC 00/44] x86: Prerequisite work for a Xen KAISER solution

On 05/01/2018 07:48, Juergen Gross wrote:
> On 04/01/18 21:21, Andrew Cooper wrote:
>> This work was developed as an SP3 mitigation, but shelved when it became clear
>> that it wasn't viable to get done in the timeframe.
>> To protect against SP3 attacks, most mappings need to be flushed while in
>> user context.  However, to protect against all cross-VM attacks, it is
>> necessary to ensure that the Xen stacks are not mapped in any other cpu's
>> address space, or an attacker can still recover at least the GPR state of
>> separate VMs.
> Above statement is too strict: it would be sufficient if no stacks of
> other domains are mapped.

Sadly not.  Having stacks shared by domain means one vcpu can still
steal at least GPR state from other vcpus belonging to the same domain.

Whether or not a specific kernel cares, some definitely will.
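
To make the granularity argument concrete, here is a toy model of which
stacks are reachable under each mapping scheme.  This is only a sketch and
not Xen code: the enum, the struct and stack_visible() are hypothetical
names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical model, not Xen code: how widely a stack is mapped. */
enum stack_scope {
    SCOPE_ALL_MAPPED,   /* every stack mapped in every address space */
    SCOPE_PER_DOMAIN,   /* only stacks of the same domain are mapped */
    SCOPE_PER_VCPU,     /* only the running vcpu's own stack is mapped */
};

/* Hypothetical identifier for a vcpu: (domain, vcpu index). */
struct vcpu_id {
    int domain;
    int vcpu;
};

/*
 * Returns true if the victim's stack is mapped in the address space the
 * attacker runs on, i.e. if an SP3-style read could reach it.
 */
bool stack_visible(enum stack_scope scope,
                   struct vcpu_id attacker, struct vcpu_id victim)
{
    bool same_domain = attacker.domain == victim.domain;
    bool same_vcpu = same_domain && attacker.vcpu == victim.vcpu;

    switch (scope) {
    case SCOPE_ALL_MAPPED:
        return true;
    case SCOPE_PER_DOMAIN:
        return same_domain;
    case SCOPE_PER_VCPU:
        return same_vcpu;
    }
    return false;
}
```

In this model, SCOPE_PER_DOMAIN hides the stacks of other domains, but
stack_visible() still returns true for two different vcpus of the same
domain, which is the case described above.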

> I'm just working on a proof of concept using dedicated per-vcpu stacks
> for 64 bit pv domains. Those stacks would be mapped in the per-domain
> region of the address space. I hope to have a RFC version of the patches
> ready next week.
> This would allow removing the per physical cpu mappings in the guest
> visible address space when doing page table isolation.
> In order to avoid SP3 attacks on other vcpus' stacks of the same guest
> we could extend the pv ABI to mark a guest's user L4 page table as
> "single use", i.e. not allowed to be active on multiple vcpus at the
> same time (introducing that ABI modification in the Linux kernel would
> be simple, as the Linux kernel currently lacks support for cross-cpu
> stack exploits and when that support is being added by per-cpu L4 user
> page tables we could just chime in). A L4 page table marked as "single
> use" would map the local vcpu stacks only.

For PV guests, it is the Xen stacks which matter, not the vcpu guest
kernel's ones.

64bit PV guest kernels are already mitigated better than KPTI can ever
manage, because there are no entry stacks or entry stubs required to be
mapped into guest userspace at all.

>> To have isolated stacks, Xen needs a per-pcpu isolated region, which requires
>> that two pCPUs never share the same %cr3.  This is trivial for 32bit PV guests
>> and HVM guests due to the existing per-vcpu Monitor Tables, but is problematic
>> for 64bit PV guests, which will run on the same %cr3 when scheduling different
>> threads from the same process.
>> To avoid breaking the PV ABI, Xen needs to shadow the guest L4 pagetables if
>> it wants to maintain the unique %cr3 property it needs.
>> tl;dr The shadowing algorithm in pt-shadow.c is too much of a performance
>> overhead to be viable, and very high risk to productise in an embargo window.
>> If we want to continue down this route, we either need someone to have a
>> clever alternative to the shadowing algorithm I came up with, or change the PV
>> ABI to require VMs not to share L4 pagetables.
>> Either way, these patches are presented to start a discussion of the issues.
>> The series as a whole is not in a suitable state for committing.
> I think patch 1 should be excluded from that statement, as it is not
> directly related to the series.

There are bits of the series I do intend to take in, largely in this
form.  Another is "x86/pv: Drop support for paging out the LDT" because
it's long since time for that to disappear.

I should also say that the net changes to context switch and
critical-structure handling across this series are a performance and
security benefit, irrespective of the KAISER/KPTI side of things.
They'd qualify for inclusion on their own merits alone (if it weren't
for the dependent L4 shadowing issues).

If you're interested, I stumbled onto patch one after introducing the
per-pcpu stack mapping, as virt_to_maddr() came out spectacularly
wrong.  Very observant readers might also notice the bit of misc
debugging which caused me to blindly stumble into XSA-243, which was an
interesting diversion from Xen crashing because of my own pagetable

