Xen project Mailing List

Re: [PATCH RFC for-4.15] x86/msr: introduce an option for legacy MSR behavior selection

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Date: Thu, 4 Mar 2021 09:48:25 +0100

Cc: Wei Liu <wl@xxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Ian Jackson <iwj@xxxxxxxxxxxxxx>

Delivery-date: Thu, 04 Mar 2021 08:48:32 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 03.03.2021 16:38, Roger Pau Monné wrote: > On Tue, Mar 02, 2021 at 04:18:59PM +0100, Jan Beulich wrote: >> On 02.03.2021 16:00, Roger Pau Monné wrote: >>> On Tue, Mar 02, 2021 at 12:16:12PM +0100, Jan Beulich wrote: >>>> On 01.03.2021 17:23, Roger Pau Monne wrote: >>>>> RFC because there's still some debate as to how we should solve the >>>>> MSR issue, this is one possible way, but IMO we need to make a >>>>> decision soon-ish because of the release timeline. >>>> >>>> Generally I think it would be far better from a user pov if >>>> things worked out of the box, at least in cases where it can be >>>> made work. Arguably for Solaris this isn't going to be possible >>>> (leaving aside the non-option of fully returning back to original >>>> behavior), but for the old-Linux-PV case I still think my proposed >>>> change is better in this regard. I realize Andrew said (on irc) >>>> that this is making the behavior even weaker. I take a different >>>> perspective: Considering a guest will install exception handlers >>>> at some point, I continue to fail to see what good it will do to >>>> try to inject a #GP(0) when we know the guest will die because of >>>> not having a handler registered just yet. It is a kernel flaw, >>>> yes, but they ended up living with it merely because of our odd >>>> prior behavior, so we can't put all the blame on them. >>> >>> My concern with this would mostly be with newer guests, in the sense >>> that people could come to rely on this behavior of not injecting a >>> #GP if the handler is not setup, which I think we should try to avoid. >>> >>> One option would be to introduce a feature that could be used in the >>> elfnotes to signal that the kernel doesn't require such workarounds >>> for MSR #GP handling, maybe: >>> >>> /* >>> * Signal that the kernel doesn't require suppressing the #GP on >>> * unknown MSR accesses if the handler is not setup. New kernels >>> * should always have this set. >>> */ >>> #define XENFEAT_msr_early_gp 16 >>> >>> We could try to backport this on the Linux kernel as far as we can >>> that we know it's safe to do so. >> >> I too did consider something like this. While I'm not opposed, it >> effectively requires well-behaved consumers who haven't been well- >> behaved in the past. >> >>> If this is not acceptable then I guess your solution is fine. Arguably >>> PV has it's own (weird) architecture, in which it might be safe to say >>> that no #GP will be injected as a result of a MSR access unless the >>> handler is setup. Ideally we should document this behaviour somewhere. >>> >>> Maybe we could add a rdmsr_safe to your path akin to what's done >>> here? >> >> Probably. Would need trying out on the affected older version, >> just like ... >> >>>>> --- a/docs/man/xl.cfg.5.pod.in >>>>> +++ b/docs/man/xl.cfg.5.pod.in >>>>> @@ -2861,6 +2861,17 @@ No MCA capabilities in above list are enabled. >>>>> >>>>> =back >>>>> >>>>> +=item B<msr_legacy_handling=BOOLEAN> >>>>> + >>>>> +Select whether to use the legacy behaviour for accesses to MSRs not >>>>> explicitly >>>>> +handled by Xen instead of injecting a #GP to the guest. Such legacy >>>>> access >>>>> +mode will force Xen to replicate the behaviour from the hardware it's >>>>> currently >>>>> +running on in order to decide whether a #GP is injected to the guest. >>>>> Note >>>>> +that the guest is never allowed access to unhandled MSRs, this option >>>>> only >>>>> +changes whether a #GP might be injected or not. >>>> >>>> This description is appropriate for reads, but not for writes: >>>> Whether a write would fault can only be known by trying a write, >>>> yet for safety reasons we can't risk doing more than a read. An >>>> option might be to make writes fault is the to be written value >>>> differs from that which the probing read has returned (i.e. the >>>> same condition which originally had caused a log message to appear >>>> in 4.14 for PV guests). >>> >>> Read values for unhandled MSRs will always be 0 with the proposed >>> code, so we would inject a #GP to the guest for any write attempt to >>> unhandled MSRs with a value different than 0. >>> >>> Maybe we should just inject a #GP to the guest for any attempts to >>> write to unhandled MSRs? >> >> ... doing this would need to. Iirc I did add the write side of the >> handling in my patch just for symmetry. I'd prefer handling to be >> symmetric, but I can see why we may not want it to be so. > > Kind of in the wrong context, but I will reply here because it's the > last message related to the MSR stuff. > > Jan: would you be fine with modifying your path to not change the > behaviour for writes (ie: always inject #GP to the guest for unhandled > accesses), and then add a rdmsr_safe to the read path in order to > decide whether to inject a #GP to the guest even if the #GP handler is > not setup? I don't mind as long as it ends up making things work (I have no reason to believe it won't). I'll give that a try. > I can modify the option introduced on this patch to always inject #GP > on unhandled writes and only inject #GP on reads if the physical MSR > access on the hardware also triggers a #GP. HVM guests with broken > behavior will require having the option enabled in order to work, > but I think that's likely OK as long term we expect all HVM guests to > be well behaved. > > My main worry with this approach is that we end up requiring half of > the common HVM guests OSes to have the 'legacy MSR handling' option > enabled in order to work. I think it's unlikely for this to happen, as > we are only aware of Solaris requiring it at the moment. > > It also raises the question whether we will allow any more exceptions > to the MSR policy, like we did for Windows and OpenBSD in: > > http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=ca88a43e660c75796656a544e54a648c60d26ef0 > http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=4175fd3ccd17face664036fa98e9329aa317f7b6 > > Or if we are just going to require those guests to enable the legacy > MSR handling instead. It is my understanding that Andrew's view is that adding such special cases is the only acceptable thing. As voiced before I don't share this view, as it means we're going to be entirely reactive to people reporting issues, when I think we should be pro-active as far as is reasonable. Independent of any pro-active measures there'll still be enough reasons for reactive changes - for example I assume Linux would now issue the log message from if (cpu_has(c, X86_FEATURE_CONSTANT_TSC)) { if (c->x86 > 0x10 || (c->x86 == 0x10 && c->x86_model >= 0x2)) { u64 val; rdmsrl(MSR_K7_HWCR, val); if (!(val & BIT(24))) pr_warn(FW_BUG "TSC doesn't count with P0 frequency!\n"); } } since we surface a zero value right now (but I didn't verify this in practice yet). Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.