[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v10 09/11] x86/ctxt: Issue a speculation barrier between vcpu contexts
On 25/01/18 15:57, Jan Beulich wrote: >>>> On 24.01.18 at 14:12, <andrew.cooper3@xxxxxxxxxx> wrote: >> @@ -1743,6 +1744,34 @@ void context_switch(struct vcpu *prev, struct vcpu >> *next) >> } >> >> ctxt_switch_levelling(next); >> + >> + if ( opt_ibpb && !is_idle_domain(nextd) ) > Is the idle domain check here really useful? Yes, because as you pointed out in v9, the outer condition isn't sufficient to exclude nextd being idle. > >> + { >> + static DEFINE_PER_CPU(unsigned int, last); >> + unsigned int *last_id = &this_cpu(last); >> + >> + /* >> + * Squash the domid and vcpu id together for comparason >> + * efficiency. We could in principle stash and compare the >> struct >> + * vcpu pointer, but this risks a false alias if a domain has >> died >> + * and the same 4k page gets reused for a new vcpu. >> + */ >> + unsigned int next_id = (((unsigned int)nextd->domain_id << 16) | >> + (uint16_t)next->vcpu_id); >> + BUILD_BUG_ON(MAX_VIRT_CPUS > 0xffff); >> + >> + /* >> + * When scheduling from a vcpu, to idle, and back to the same >> vcpu >> + * (which might be common in a lightly loaded system, or when >> + * using vcpu pinning), there is no need to issue IBPB, as we >> are >> + * returning to the same security context. >> + */ >> + if ( *last_id != next_id ) >> + { >> + wrmsrl(MSR_PRED_CMD, PRED_CMD_IBPB); >> + *last_id = next_id; >> + } >> + } >> } >> >> context_saved(prev); > Short of any real numbers (or a proper explanation) having been > provided, I've done some measurements. Indeed I can see quite > high a rate of cases of execution coming this way despite the > vCPU not really changing during early boot of HVM guests. This > goes down quite a bit later on, but obviously that's also workload > dependent. But the number of cases where the barrier emission > could be avoided remains non-negligible, so I agree the extra > avoidance logic is probably warranted. On that basis (perhaps > with the idle check above removed) > Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > > For the record, the overwhelming majority of calls to > __sync_local_execstate() being responsible for the behavior > come from invalidate_interrupt(), which suggests to me that > there's a meaningful number of cases where a vCPU is migrated > to another CPU and then back, without another vCPU having > run on the original CPU in between. If I'm not wrong with this, > I have to question why the vCPU is migrated then in the first > place. This is a known (mis)feature of the Credit scheduler. When a PCPU goes idle, it aggressively steals work from the adjacent cpus, even if the adjacent cpu only has a single vcpu to run and is running it. XenServer has this gross hack which makes aggregate networking measurements far far better https://github.com/xenserver/xen.pg/blob/XS-7.1.x/master/sched-credit1-use-per-pcpu-runqueue-count.patch Dario made a different fix to Credit1 upstream which was supposed to resolve this behaviour (although I can't locate the patch by a list of titles), but based on these observations, I'd say the misfeature hasn't been fixed. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |