
Re: [PATCH 2/4] x86/xstate: Rework XSAVE/XRSTOR given a newer toolchain baseline


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Thu, 8 Jan 2026 21:08:48 +0000
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 08 Jan 2026 21:09:13 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 06/01/2026 7:59 am, Jan Beulich wrote:
>>>> @@ -489,17 +484,17 @@ void xrstor(struct vcpu *v, uint64_t mask)
>>>>              ptr->xsave_hdr.xcomp_bv = 0;
>>>>          }
>>>>          memset(ptr->xsave_hdr.reserved, 0, sizeof(ptr->xsave_hdr.reserved));
>>>> -        continue;
>>>> +        goto retry;
>>>>  
>>>>      case 2: /* Stage 2: Reset all state. */
>>>>          ptr->fpu_sse.mxcsr = MXCSR_DEFAULT;
>>>>          ptr->xsave_hdr.xstate_bv = 0;
>>>>          ptr->xsave_hdr.xcomp_bv = v->arch.xcr0_accum & XSTATE_XSAVES_ONLY
>>>>              ? XSTATE_COMPACTION_ENABLED : 0;
>>>> -        continue;
>>>> -    }
>>>> +        goto retry;
>>>>  
>>>> -        domain_crash(current->domain);
>>>> +    default: /* Stage 3: Nothing else to do. */
>>>> +        domain_crash(v->domain, "Uncorrectable XRSTOR fault\n");
>>>>          return;
>>> There's an unexplained change here as to which domain is being crashed.
>>> You switch to crashing the subject domain, yet if that's not also the
>>> requester, it isn't "guilty" in causing the observed fault.
>> So dom0 should be crashed because there's bad data in the migration stream?
> Well, I'm not saying the behavior needs to stay like this, or that it's
> the best of all possible options. But in principle Dom0 could sanitize the
> migration stream before passing it to Xen. So it is still first and foremost
> Dom0 which is to blame.

BNDCFGU contains a pointer which, for PV context, needs access_ok(), not
just a regular canonical check.  Most supervisor states are in a similar
position.
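
To illustrate the distinction, here is a minimal sketch, not Xen's code. The names are hypothetical, 48-bit virtual addresses are assumed, and the reserved range below is an assumed stand-in for the hypervisor's mappings. An address can be perfectly canonical and still be one that a PV guest must not be allowed to reference:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed reserved range, for illustration only. */
#define SKETCH_HV_START 0xffff800000000000ULL
#define SKETCH_HV_END   0xffff880000000000ULL

/* Canonical (48-bit VA assumed): bits 63:47 are all zero or all one. */
static bool sketch_is_canonical(uint64_t addr)
{
    uint64_t top = addr >> 47;

    return top == 0 || top == 0x1ffff;
}

/*
 * Hypothetical stand-in for an access_ok()-style check: the whole range
 * must be canonical, must not wrap, and must not overlap the reserved range.
 */
static bool sketch_guest_may_access(uint64_t addr, uint64_t size)
{
    uint64_t last = addr + size - 1;

    if ( size == 0 || last < addr )   /* empty or wrapping range */
        return false;

    if ( !sketch_is_canonical(addr) || !sketch_is_canonical(last) )
        return false;

    /* Reject any overlap with [SKETCH_HV_START, SKETCH_HV_END). */
    return last < SKETCH_HV_START || addr >= SKETCH_HV_END;
}
```

With these definitions, 0xffff800000001000 passes the canonical check but fails the range check, which is the gap a canonical-only validation would leave open.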

Just because Xen has managed to get away without such checks (by not yet
supporting a state where it matters), I don't agree that it's safe to
trust dom0 to do this.


For this case, it's v's xstate buffer which cannot be loaded, so it's v
which cannot be context switched into, and must be crashed.  More below.
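
The staged recovery in the hunk quoted above can be modelled roughly as follows. This is a simplified, self-contained sketch and not the patch itself: the structure, the sketch_try_load() stand-in for the faulting instruction, and the "poisoned" test hook are all hypothetical. The point it shows is that once every repair stage has been tried, the only remaining action falls on the owner of the buffer:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MXCSR_DEFAULT 0x1f80u
#define SKETCH_MXCSR_MASK    0xffffu

/* Hypothetical, heavily reduced model of an xsave area. */
struct sketch_xsave_area {
    uint32_t mxcsr;
    uint64_t xstate_bv;
    uint64_t xcomp_bv;
    uint8_t  reserved[48];
    bool     poisoned;     /* test hook: damage no stage can repair */
};

enum sketch_result { SKETCH_LOADED, SKETCH_CRASH_SUBJECT };

/* Stand-in for the load instruction: returns false where it would fault. */
static bool sketch_try_load(const struct sketch_xsave_area *a)
{
    for ( size_t i = 0; i < sizeof(a->reserved); i++ )
        if ( a->reserved[i] )
            return false;

    return !(a->mxcsr & ~SKETCH_MXCSR_MASK) && !a->poisoned;
}

/* Retry with progressively more drastic repairs; *faults counts failures. */
static enum sketch_result sketch_restore(struct sketch_xsave_area *a,
                                         unsigned int *faults)
{
    *faults = 0;

 retry:
    if ( sketch_try_load(a) )
        return SKETCH_LOADED;

    switch ( ++*faults )
    {
    case 1: /* Stage 1: sanitise the header. */
        memset(a->reserved, 0, sizeof(a->reserved));
        goto retry;

    case 2: /* Stage 2: reset all state. */
        a->mxcsr = SKETCH_MXCSR_DEFAULT;
        a->xstate_bv = 0;
        goto retry;

    default: /* Stage 3: nothing else to do; the buffer's owner is crashed. */
        return SKETCH_CRASH_SUBJECT;
    }
}
```

A caller would treat SKETCH_CRASH_SUBJECT as "this vCPU can never be switched into again", which is why the subject rather than the requester ends up being the one acted upon.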


>> v is always curr.
> Not quite - see xstate_set_init().

Also more below.

> And for some of the callers of
> hvm_update_guest_cr() I also don't think they always act on current. In
> particular hvm_vcpu_reset_state() never does, I suppose (not the least
> because of the vcpu_pause() in its sole caller).

We discussed the need to not be remotely poking register state like
that.  But I don't see where the connection is between
hvm_update_guest_cr() and xsave()/xrstor().

Tangent: hvm_vcpu_reset_state() is terribly named as it's attempting to
put the vCPU into the INIT state, not the #RESET state.

But it only operates on the xstate header in memory while the target is
de-scheduled.  It's not using XSAVE/XRSTOR to load the results into
registers as far as I can tell.

>
>>   XRSTOR can't be used correctly outside of the subject context,
> Then are you suggesting e.g. xstate_set_init() is buggy?

No, but it switches into enough of v's context to function.  Really it's
neither current nor remote context.

But its single caller is adjust_bnd() in the emulator, so it's always
actually current context with a no-op on xcr0.

As said on Matrix, I think it's going to be necessary to remove MPX to
continue the XSAVE cleanup.

~Andrew



 

