Re: [Xen-devel] XSAVE IRC thread

Ben Guthro <Ben.Guthro@xxxxxxxxxx> 05/04/13 1:14 PM
On May 3, 2013, at 10:58 AM, Jan Beulich <JBeulich@xxxxxxxx> wrote:
>> Attached a patch that I think should address this problem. It's
>> against the tip of the staging tree, and doesn't apply without
>> adjustment to 4.2 (and making it work for 4.1 would be quite a
>> bit more work) - please let me know whether that's sufficient for
>> you testing this, or whether you need me to do any backporting.
>> I didn't verify this with any Windows, but since the same issue
>> can - if one is looking for it - be observed on PV Linux, I did verify
>> the patch to help there.
>> I'd like to note though that while this is expected to help with
>> 32-bit guests, and with 64-bit guest kernels doing such checking
>> after using the respective save (and possibly restore) instructions
>> with a 64-bit operand size, the hypervisor has no way of knowing
>> whether the context actually belongs to a 32-bit process while the
>> guest is in kernel (64-bit) mode. That means that from a 32-bit
>> app's perspective, inconsistencies could still be observed under
>> certain conditions (but the case where the hypervisor side save
>> happens after a VM exit from user mode should also work with
>> that patch). I don't see any way to hide that, other than running
>> on CPUs that don't save/restore the selector values at all
>> anymore (Intel at least has a feature bit for this).
>Thank you very much for looking into this so quickly.
>Our QA infrastructure is currently set up for testing against our XenClient
>product built on Xen 4.2.2.
>Since this is an intermittent failure, in order to reduce the number of
>variables in testing this solution, I'll look into backporting this to 4.2
>on Mon, and report back after a night or two of testing.

Meanwhile I realized that the patch will need a little further adjustment:
Parts of the xsave modifications need to/can become conditional upon
FPU state being saved/restored (which in particular may not be the case
during the eager restore phase needed for AMD LWP, but which otherwise
would also be a latent bug). This shouldn't affect testing of what I sent
on Friday, though.

Furthermore, I have meanwhile also recalled what I had done on native
Linux years ago, which would permit determining the needed save/restore
layout (32- vs. 64-bit) regardless of the current guest execution mode. That
would come with a price, though, since some parts of the save operation
would need to be done twice (needing either two fxsave-s/xsave-s, or an
additional fnstenv) - will need to do some measurements to see how bad
this would turn out, and which of the two would be preferable. (This
implicitly also tells me that native Linux ought to have the same issue when
running 32-bit Windows in this verifier mode on 64-bit KVM. Kirk - did you
ever run across FPU state inconsistency bug checks in the WHQL testing on
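
For readers following the layout question above, here is a minimal sketch of how the needed layout might be derived from the saved image itself. It is not the patch under discussion. It assumes a 64-bit-operand-size fxsave has already filled a 512-byte image, and it relies only on the architectural layout: the 64-bit format keeps FIP in bytes 8-15 and FDP in bytes 16-23, while the 32-bit format keeps FCS in bytes 12-13 and FDS in bytes 20-21. The helper names (fpu_word_size, set_selectors, and the load/store helpers) are hypothetical, and the selector values would have to come from the second save (another fxsave/xsave or an fnstenv), which is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets inside the 512-byte FXSAVE image (architectural layout). */
enum {
    FXSAVE_SIZE = 512,
    FIP_OFF     = 8,   /* 64-bit format: FIP occupies bytes 8..15  */
    FCS_OFF     = 12,  /* 32-bit format: FCS occupies bytes 12..13 */
    FDP_OFF     = 16,  /* 64-bit format: FDP occupies bytes 16..23 */
    FDS_OFF     = 20   /* 32-bit format: FDS occupies bytes 20..21 */
};

/* Little-endian load/store helpers, written byte-wise so the sketch
 * behaves the same on any host. */
static inline uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

static inline void store_le64(uint8_t *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

static inline uint16_t load_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static inline void store_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Hypothetical helper: decide which layout can represent the saved
 * pointers. Returns 4 when both FIP and FDP fit in 32 bits (the 32-bit
 * layout with selectors loses nothing), otherwise 8. */
static inline int fpu_word_size(const uint8_t *img)
{
    assert(img != NULL);
    uint64_t fip = load_le64(img + FIP_OFF);
    uint64_t fdp = load_le64(img + FDP_OFF);
    return ((fip | fdp) >> 32) ? 8 : 4;
}

/* Hypothetical helper: convert the image to the 32-bit layout by
 * writing the selectors obtained from the second save. Only valid
 * when fpu_word_size() returned 4, since it overwrites the upper
 * halves of the 64-bit FIP/FDP fields. */
static inline void set_selectors(uint8_t *img, uint16_t fcs, uint16_t fds)
{
    assert(img != NULL);
    store_le16(img + FCS_OFF, fcs);
    store_le16(img + FCS_OFF + 2, 0);  /* reserved in the 32-bit layout */
    store_le16(img + FDS_OFF, fds);
    store_le16(img + FDS_OFF + 2, 0);  /* reserved in the 32-bit layout */
}
```

With an image whose FIP and FDP both lie below 4 GiB, fpu_word_size() returns 4 and the selectors can be patched in. A kernel-range FIP such as 0xffffffff81000000 yields 8, so the image stays in the 64-bit layout. The cost mentioned above is the second save needed to obtain the selectors in the first case.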


