[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC for-4.15] x86/msr: introduce an option for legacy MSR behavior selection



On 03.03.2021 16:38, Roger Pau Monné wrote:
> On Tue, Mar 02, 2021 at 04:18:59PM +0100, Jan Beulich wrote:
>> On 02.03.2021 16:00, Roger Pau Monné wrote:
>>> On Tue, Mar 02, 2021 at 12:16:12PM +0100, Jan Beulich wrote:
>>>> On 01.03.2021 17:23, Roger Pau Monne wrote:
>>>>> RFC because there's still some debate as to how we should solve the
>>>>> MSR issue, this is one possible way, but IMO we need to make a
>>>>> decision soon-ish because of the release timeline.
>>>>
>>>> Generally I think it would be far better from a user pov if
>>>> things worked out of the box, at least in cases where it can be
>>>> made work. Arguably for Solaris this isn't going to be possible
>>>> (leaving aside the non-option of fully returning back to original
>>>> behavior), but for the old-Linux-PV case I still think my proposed
>>>> change is better in this regard. I realize Andrew said (on irc)
>>>> that this is making the behavior even weaker. I take a different
>>>> perspective: Considering a guest will install exception handlers
>>>> at some point, I continue to fail to see what good it will do to
>>>> try to inject a #GP(0) when we know the guest will die because of
>>>> not having a handler registered just yet. It is a kernel flaw,
>>>> yes, but they ended up living with it merely because of our odd
>>>> prior behavior, so we can't put all the blame on them.
>>>
>>> My concern with this would mostly be with newer guests, in the sense
>>> that people could come to rely on this behavior of not injecting a
>>> #GP if the handler is not setup, which I think we should try to avoid.
>>>
>>> One option would be to introduce a feature that could be used in the
>>> elfnotes to signal that the kernel doesn't require such workarounds
>>> for MSR #GP handling, maybe:
>>>
>>> /*
>>>  * Signal that the kernel doesn't require suppressing the #GP on
>>>  * unknown MSR accesses if the handler is not setup. New kernels
>>>  * should always have this set.
>>>  */
>>> #define XENFEAT_msr_early_gp   16
>>>
>>> We could try to backport this on the Linux kernel as far as we can
>>> that we know it's safe to do so.
>>
>> I too did consider something like this. While I'm not opposed, it
>> effectively requires well-behaved consumers who haven't been well-
>> behaved in the past.
>>
>>> If this is not acceptable then I guess your solution is fine. Arguably
>>> PV has it's own (weird) architecture, in which it might be safe to say
>>> that no #GP will be injected as a result of a MSR access unless the
>>> handler is setup. Ideally we should document this behaviour somewhere.
>>>
>>> Maybe we could add a rdmsr_safe to your path akin to what's done
>>> here?
>>
>> Probably. Would need trying out on the affected older version,
>> just like ...
>>
>>>>> --- a/docs/man/xl.cfg.5.pod.in
>>>>> +++ b/docs/man/xl.cfg.5.pod.in
>>>>> @@ -2861,6 +2861,17 @@ No MCA capabilities in above list are enabled.
>>>>>  
>>>>>  =back
>>>>>  
>>>>> +=item B<msr_legacy_handling=BOOLEAN>
>>>>> +
>>>>> +Select whether to use the legacy behaviour for accesses to MSRs not 
>>>>> explicitly
>>>>> +handled by Xen instead of injecting a #GP to the guest.  Such legacy 
>>>>> access
>>>>> +mode will force Xen to replicate the behaviour from the hardware it's 
>>>>> currently
>>>>> +running on in order to decide whether a #GP is injected to the guest.  
>>>>> Note
>>>>> +that the guest is never allowed access to unhandled MSRs, this option 
>>>>> only
>>>>> +changes whether a #GP might be injected or not.
>>>>
>>>> This description is appropriate for reads, but not for writes:
>>>> Whether a write would fault can only be known by trying a write,
>>>> yet for safety reasons we can't risk doing more than a read. An
>>>> option might be to make writes fault is the to be written value
>>>> differs from that which the probing read has returned (i.e. the
>>>> same condition which originally had caused a log message to appear
>>>> in 4.14 for PV guests).
>>>
>>> Read values for unhandled MSRs will always be 0 with the proposed
>>> code, so we would inject a #GP to the guest for any write attempt to
>>> unhandled MSRs with a value different than 0.
>>>
>>> Maybe we should just inject a #GP to the guest for any attempts to
>>> write to unhandled MSRs?
>>
>> ... doing this would need to. Iirc I did add the write side of the
>> handling in my patch just for symmetry. I'd prefer handling to be
>> symmetric, but I can see why we may not want it to be so.
> 
> Kind of in the wrong context, but I will reply here because it's the
> last message related to the MSR stuff.
> 
> Jan: would you be fine with modifying your path to not change the
> behaviour for writes (ie: always inject #GP to the guest for unhandled
> accesses), and then add a rdmsr_safe to the read path in order to
> decide whether to inject a #GP to the guest even if the #GP handler is
> not setup?

I don't mind as long as it ends up making things work (I have no
reason to believe it won't). I'll give that a try.

> I can modify the option introduced on this patch to always inject #GP
> on unhandled writes and only inject #GP on reads if the physical MSR
> access on the hardware also triggers a #GP. HVM guests with broken
> behavior will require having the option enabled in order to work,
> but I think that's likely OK as long term we expect all HVM guests to
> be well behaved.
> 
> My main worry with this approach is that we end up requiring half of
> the common HVM guests OSes to have the 'legacy MSR handling' option
> enabled in order to work. I think it's unlikely for this to happen, as
> we are only aware of Solaris requiring it at the moment.
> 
> It also raises the question whether we will allow any more exceptions
> to the MSR policy, like we did for Windows and OpenBSD in:
> 
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=ca88a43e660c75796656a544e54a648c60d26ef0
> http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=4175fd3ccd17face664036fa98e9329aa317f7b6
> 
> Or if we are just going to require those guests to enable the legacy
> MSR handling instead.

It is my understanding that Andrew's view is that adding such special
cases is the only acceptable thing. As voiced before I don't share
this view, as it means we're going to be entirely reactive to people
reporting issues, when I think we should be pro-active as far as is
reasonable. Independent of any pro-active measures there'll still be
enough reasons for reactive changes - for example I assume Linux
would now issue the log message from

        if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) {

                if (c->x86 > 0x10 ||
                    (c->x86 == 0x10 && c->x86_model >= 0x2)) {
                        u64 val;

                        rdmsrl(MSR_K7_HWCR, val);
                        if (!(val & BIT(24)))
                                pr_warn(FW_BUG "TSC doesn't count with P0 
frequency!\n");
                }
        }

since we surface a zero value right now (but I didn't verify this in
practice yet).

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.