
Re: [Xen-devel] [PATCH v9 09/11] x86/ctxt: Issue a speculation barrier between vcpu contexts



On 19/01/18 13:25, Jan Beulich wrote:
>>>> On 18.01.18 at 16:46, <andrew.cooper3@xxxxxxxxxx> wrote:
>> --- a/xen/arch/x86/domain.c
>> +++ b/xen/arch/x86/domain.c
>> @@ -1743,6 +1743,29 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
>>          }
>>  
>>          ctxt_switch_levelling(next);
>> +
>> +        if ( opt_ibpb )
>> +        {
>> +            static DEFINE_PER_CPU(unsigned int, last_nonidle);
>> +            unsigned int *last_id = &this_cpu(last_nonidle);
> "nonidle" is not entirely correct without an is_idle_...() check around
> it, as the outer condition leaves room for idle vCPU-s to make it here.
> But take this as a remark, not a strict request to change the name.

If you can suggest a better name, I'll use it.

>
>> +            /* Squash the domid and vcpu id together for efficiency. */
>> +            unsigned int next_id = (((unsigned int)nextd->domain_id << 16) |
>> +                                    (uint16_t)next->vcpu_id);
> Is this really more efficient than just comparing struct vcpu pointers?

vcpu pointers are rather more susceptible to false aliasing in the case
that the 4k memory allocation behind struct vcpu gets reused.

The risks are admittedly minute, but this is a much safer option.
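
To illustrate the aliasing concern, a rough sketch (last_vcpu here is a
hypothetical alternative, not what the patch does):

    /* Hypothetical pointer-based tracking: */
    static DEFINE_PER_CPU(const struct vcpu *, last_vcpu);

    /*
     * If the previous domain is destroyed and a new domain's struct vcpu
     * happens to be allocated in the same 4k region, this comparison
     * falsely matches and the IBPB is skipped across a real security
     * boundary.
     */
    if ( this_cpu(last_vcpu) != next )
        wrmsrl(MSR_PRED_CMD, PRED_CMD_IBPB);
    this_cpu(last_vcpu) = next;

whereas the (domid << 16 | vcpu_id) value in the patch only repeats if
the same domid/vcpu_id pair is handed out again.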

>
>> +            BUILD_BUG_ON(MAX_VIRT_CPUS > 0xffff);
>> +
>> +            /*
>> +             * When scheduling from a vcpu, to idle, and back to the same vcpu
>> +             * (which might be common in a lightly loaded system, or when
>> +             * using vcpu pinning), there is no need to issue IBPB, as we are
>> +             * returning to the same security context.
>> +             */
>> +            if ( *last_id != next_id )
>> +            {
>> +                wrmsrl(MSR_PRED_CMD, PRED_CMD_IBPB);
>> +                *last_id = next_id;
>> +            }
> I've read David's mails regarding this another time, but I still can't
> conclude why this extra logic would be necessary. Transitions
> from a guest vCPU through idle and back to that very vCPU do
> not alter per_cpu(curr_vcpu, ...) - __context_switch() is the
> only place to update it. There's certainly the potential for it to
> be called through __sync_local_execstate(), but is that a
> common case? I'd support introduction of the extra logic only if
> so.
>
> Furthermore, if this indeed was a sufficiently common case, doing
> lazy context switches at all for HVM guests would once again
> need to be put under question.

David found that transitions to idle and back to the same vcpu were
reliably taking an unnecessary IBPB.

I'm deeply suspicious that lazy context switching is actually a win,
given that a TLB shootdown IPI interrupting the half-idle state will
switch it to fully idle anyway (as a half-idle CPU still remains set in
the domain's dirty mask).  Andy Luto dropped lazy FPU from Linux a while
ago, showing that it was worse in most cases even for regular processes.

I certainly don't think that trapping #NM for HVM vcpu lazy FPU is
clever on any hardware.

Just one of the many things to investigate in copious free time.
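
For context, the lazy FPU pattern being referred to looks roughly like
this (a sketch with approximate names, not Xen's actual implementation):

    /* On switching away from a vcpu, don't save/restore FPU state
     * eagerly; arm CR0.TS so the next FPU instruction raises #NM. */
    static void fpu_lazy_switch_out(void)
    {
        write_cr0(read_cr0() | X86_CR0_TS);
    }

    /* #NM handler: only now pay for loading the vcpu's FPU state.  For
     * an HVM vcpu the trap is a full vmexit, hence the objection above. */
    static void handle_nm(struct vcpu *curr)
    {
        clts();                           /* clear CR0.TS */
        vcpu_restore_fpu_lazy(curr);      /* approximate helper name */
    }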

>
>> --- a/xen/arch/x86/spec_ctrl.c
>> +++ b/xen/arch/x86/spec_ctrl.c
>> @@ -33,6 +33,7 @@ static enum ind_thunk {
>>      THUNK_JMP,
>>  } opt_thunk __initdata = THUNK_DEFAULT;
>>  static int opt_ibrs __initdata = -1;
>> +bool __read_mostly opt_ibpb = true;
>> static bool opt_rsb_native __initdata = true, opt_rsb_vmexit __initdata = true;
> Considering the (better imo) placement of __read_mostly here,
> would you mind moving the __initdata accordingly in the earlier
> patches (opt_thunk having reasons to be an exception)?
> Otherwise please move the __read_mostly to the end.

Ok.
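
For illustration, the two consistent placements being asked for would be
roughly:

    /* Attribute between type and identifier throughout ... */
    static int __initdata opt_ibrs = -1;
    bool __read_mostly opt_ibpb = true;

    /* ... or attribute after the identifier throughout: */
    static int opt_ibrs __initdata = -1;
    bool opt_ibpb __read_mostly = true;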

~Andrew
