
Re: [PATCH for-4.16] Revert "x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents"



On 25/11/2021 10:43, Roger Pau Monné wrote:
> On Thu, Nov 25, 2021 at 11:25:36AM +0100, Jan Beulich wrote:
>> On 24.11.2021 22:11, Andrew Cooper wrote:
>>> OSSTest has identified a 3rd regression caused by this change.  Migration
>>> between Xen 4.15 and 4.16 on the nocera pair of machines (AMD Opteron 4133)
>>> fails with:
>>>
>>>   xc: error: Failed to set CPUID policy: leaf 00000000, subleaf ffffffff, msr ffffffff (22 = Invalid argument): Internal error
>>>   xc: error: Restore failed (22 = Invalid argument): Internal error
>>>
>>> which is a safety check to prevent resuming the guest when the CPUID data has
>>> been truncated.  The problem is caused by shrinking of the max policies, which
>>> is an ABI that needs handling compatibly between different versions of Xen.
>>>
>>> Furthermore, shrinking of the default policies also breaks things in some
>>> cases, because certain cpuid= settings in a VM config file which used to have
>>> an effect will now be silently discarded.
>>>
>>> This reverts commit 540d911c2813c3d8f4cdbb3f5672119e5e768a3d, as well as the
>>> partial fix attempt in 81da2b544cbb003a5447c9b14d275746ad22ab37 (which added
>>> one new case where cpuid= settings might not apply correctly) and restores the
>>> same behaviour as Xen 4.15.
>>>
>>> Fixes: 540d911c2813 ("x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents")
>>> Fixes: 81da2b544cbb ("x86/cpuid: prevent shrinking migrated policies max leaves")
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> While not strictly needed with Roger having given his already,
>> Acked-by: Jan Beulich <jbeulich@xxxxxxxx>
>> to signal my (basic) agreement with the course of action taken.
>> Nevertheless I fear this is going to become yet one more case where
>> future action is promised, but things then die out.
> I'm certainly happy to look at newer versions of this patch, but I
> think we should consider doing the shrinking only on the toolstack
> side, and only after all the manipulations on the policy have been
> performed.

Correct.

The max policies cannot be shrunk - they are, by definition, the upper
bounds that we audit against.  (More precisely, they must never end up
lower than an older Xen used to offer on the same configuration, and
must not be lower than anything the user may opt in to.)
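
To illustrate why a shrunk max policy breaks migration, here is a minimal
sketch.  It assumes a hypothetical, heavily simplified policy holding only
the max basic leaf; struct toy_policy and toy_audit() are made-up names for
illustration, not Xen's real structures or audit logic, and the leaf values
used with them are arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy policy: only the max basic leaf is modelled. */
struct toy_policy {
    uint32_t max_leaf;
};

/*
 * A guest policy is acceptable only if it stays within the upper bound
 * described by the max policy.
 */
static bool toy_audit(const struct toy_policy *max,
                      const struct toy_policy *guest)
{
    return guest->max_leaf <= max->max_leaf;
}
```

With this model, a guest whose max_leaf was accepted by an older,
unshrunk max policy is rejected once the max policy on the receiving side
has been shrunk below it, even though nothing about the guest changed.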

The default policies can in principle be shrunk, but only if the
toolstack learns to grow max leaf too (which it will need to). 
Nevertheless, actually shrinking the default policies is actively
unhelpful, because it is wasting time doing something which the
toolstack needs to undo later.

The policy for new domains should be shrunk, but only after every other
adjustment is made.  This is one small aspect of teaching the toolstack
to properly understand CPUID (and MSR) policies, and has always been part
of the plan.
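
As a rough sketch of why the ordering matters, assume a hypothetical flat
policy with a handful of basic leaves.  All names here (struct toy_cpuid,
toy_shrink_max_leaf(), TOY_NR_LEAVES) are invented for illustration and do
not correspond to the real libx86 or toolstack interfaces.

```c
#include <stdint.h>

#define TOY_NR_LEAVES 8 /* hypothetical number of basic leaves modelled */

struct toy_leaf {
    uint32_t a, b, c, d;
};

/* Hypothetical toy policy: max_leaf is kept separately from the data. */
struct toy_cpuid {
    uint32_t max_leaf;
    struct toy_leaf leaf[TOY_NR_LEAVES];
};

static int toy_leaf_is_empty(const struct toy_leaf *l)
{
    return !(l->a | l->b | l->c | l->d);
}

/*
 * Lower max_leaf to the highest leaf with any non-zero content.
 * Intended to run once, after every other adjustment to the policy.
 */
static void toy_shrink_max_leaf(struct toy_cpuid *p)
{
    uint32_t i = p->max_leaf;

    /* Clamp to the modelled array so the loop never reads out of bounds. */
    if (i >= TOY_NR_LEAVES)
        i = TOY_NR_LEAVES - 1;

    while (i > 0 && toy_leaf_is_empty(&p->leaf[i]))
        --i;

    p->max_leaf = i;
}
```

In this model, shrinking before a cpuid= style adjustment populates a
higher leaf leaves max_leaf below that leaf, so the setting has no visible
effect.  Applying the adjustment first and shrinking last keeps max_leaf
high enough to cover it.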

~Andrew



 

