Xen project Mailing List

Re: [PATCH] x86/vmx: Revert "x86/VMX: sanitize rIP before re-entering guest"

To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Thu, 15 Oct 2020 10:01:28 +0200

Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>

Delivery-date: Thu, 15 Oct 2020 08:01:45 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 14.10.2020 15:57, Andrew Cooper wrote: > On 13/10/2020 16:58, Jan Beulich wrote: >> On 09.10.2020 17:09, Andrew Cooper wrote: >>> At the time of XSA-170, the x86 instruction emulator really was broken, and >>> would allow arbitrary non-canonical values to be loaded into %rip. This was >>> fixed after the embargo by c/s 81d3a0b26c1 "x86emul: limit-check branch >>> targets". >>> >>> However, in a demonstration that off-by-one errors really are one of the >>> hardest programming issues we face, everyone involved with XSA-170, myself >>> included, mistook the statement in the SDM which says: >>> >>> If the processor supports N < 64 linear-address bits, bits 63:N must be >>> identical >>> >>> to mean "must be canonical". A real canonical check is bits 63:N-1. >>> >>> VMEntries really do tolerate a not-quite-canonical %rip, specifically to >>> cater >>> to the boundary condition at 0x0000800000000000. >>> >>> Now that the emulator has been fixed, revert the XSA-170 change to fix >>> architectural behaviour at the boundary case. The XTF test case for XSA-170 >>> exercises this corner case, and still passes. >>> >>> Fixes: ffbbfda377 ("x86/VMX: sanitize rIP before re-entering guest") >>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >> But why revert the change rather than fix ... >> >>> @@ -4280,38 +4280,6 @@ void vmx_vmexit_handler(struct cpu_user_regs *regs) >>> out: >>> if ( nestedhvm_vcpu_in_guestmode(v) ) >>> nvmx_idtv_handling(); >>> - >>> - /* >>> - * VM entry will fail (causing the guest to get crashed) if rIP (and >>> - * rFLAGS, but we don't have an issue there) doesn't meet certain >>> - * criteria. As we must not allow less than fully privileged mode to >>> have >>> - * such an effect on the domain, we correct rIP in that case (accepting >>> - * this not being architecturally correct behavior, as the injected #GP >>> - * fault will then not see the correct [invalid] return address). >>> - * And since we know the guest will crash, we crash it right away if it >>> - * already is in most privileged mode. >>> - */ >>> - mode = vmx_guest_x86_mode(v); >>> - if ( mode == 8 ? !is_canonical_address(regs->rip) >> ... the wrong use of is_canonical_address() here? By reverting >> you open up avenues for XSAs in case we get things wrong elsewhere, >> including ... >> >>> - : regs->rip != regs->eip ) >> ... for 32-bit guests. > > Because the only appropriate alternative would be ASSERT_UNREACHABLE() > and domain crash. > > This logic corrupts guest state. > > Running with corrupt state is every bit an XSA as hitting a VMEntry > failure if it can be triggered by userspace, but the latter safer and > much more obvious. I disagree. For CPL > 0 we don't "corrupt" guest state any more than reporting a #GP fault when one is going to be reported anyway (as long as the VM entry doesn't fail, and hence the guest won't get crashed). IOW this raising of #GP actually is a precautionary measure to _avoid_ XSAs. Nor do I agree with the "much more obvious" aspect: A VM entry failure requires quite a bit of analysis to recognize what has caused it; whether a non-pseudo-canonical RIP is what catches your eye right away is simply unknown. The gprintk() that you delete, otoh, says very clearly what we have found to be wrong. > It was the appropriate security fix (give or take the functional bug in > it) at the time, given the complexity of retrofitting zero length > instruction fetches to the emulator. > > However, it is one of a very long list of guest-state-induced VMEntry > failures, with non-trivial logic which we assert will pass, on a > fastpath, where hardware also performs the same checks and we already > have a runtime safe way of dealing with errors. (Hence not actually > using ASSERT_UNREACHABLE() here.) "Runtime safe" as far as Xen is concerned, I take it. This isn't safe for the guest at all, as vmx_failed_vmentry() results in an unconditional domain_crash(). I certainly buy the fast path aspect of your comment, and if you were moving the guest state adjustment into vmx_failed_vmentry(), I'd be fine with the deletion here. > It isn't appropriate for this check to exist on its own (i.e. without > other guest state checks), Well, if we run into cases where we get things wrong, more checks and adjustments may want adding. Sadly each one of those has a fair chance of needing an XSA. As an aside, nvmx_n2_vmexit_handler()'s handling of VMX_EXIT_REASONS_FAILED_VMENTRY looks pretty bogus - this is a flag, not a separate exit reason. I guess I'll make a patch ... Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.