
Re: [Xen-devel] [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall



On 20/11/15 09:15, Paul Durrant wrote:
>> -----Original Message-----
>> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
>> Sent: 19 November 2015 17:09
>> To: Paul Durrant; xen-devel@xxxxxxxxxxxxxxxxxxxx
>> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org); Jan Beulich
>> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
>>
>> On 19/11/15 16:57, Paul Durrant wrote:
>>>> -----Original Message-----
>>>> From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
>>>> Sent: 19 November 2015 16:07
>>>> To: Paul Durrant; xen-devel@xxxxxxxxxxxxxxxxxxxx
>>>> Cc: Ian Jackson; Stefano Stabellini; Ian Campbell; Wei Liu; Keir (Xen.org); Jan Beulich
>>>> Subject: Re: [PATCH v3] x86/hvm/viridian: flush remote tlbs by hypercall
>>>>
>>>> On 19/11/15 13:19, Paul Durrant wrote:
>>>>> @@ -561,10 +584,81 @@ int viridian_hypercall(struct cpu_user_regs *regs)
>>>>>      switch ( input.call_code )
>>>>>      {
>>>>>      case HvNotifyLongSpinWait:
>>>>> +        /*
>>>>> +         * See Microsoft Hypervisor Top Level Spec. section 18.5.1.
>>>>> +         */
>>>>>          perfc_incr(mshv_call_long_wait);
>>>>>          do_sched_op(SCHEDOP_yield, guest_handle_from_ptr(NULL, void));
>>>>>          status = HV_STATUS_SUCCESS;
>>>>>          break;
>>>>> +
>>>>> +    case HvFlushVirtualAddressSpace:
>>>>> +    case HvFlushVirtualAddressList:
>>>>> +    {
>>>>> +        cpumask_t *pcpu_mask;
>>>>> +        struct vcpu *v;
>>>>> +        struct {
>>>>> +            uint64_t address_space;
>>>>> +            uint64_t flags;
>>>>> +            uint64_t vcpu_mask;
>>>>> +        } input_params;
>>>>> +
>>>>> +        /*
>>>>> +         * See Microsoft Hypervisor Top Level Spec. sections 12.4.2
>>>>> +         * and 12.4.3.
>>>>> +         */
>>>>> +        perfc_incr(mshv_flush);
>>>>> +
>>>>> +        /* These hypercalls should never use the fast-call convention. */
>>>>> +        status = HV_STATUS_INVALID_PARAMETER;
>>>>> +        if ( input.fast )
>>>>> +            break;
>>>>> +
>>>>> +        /* Get input parameters. */
>>>>> +        if ( hvm_copy_from_guest_phys(&input_params, input_params_gpa,
>>>>> +                                      sizeof(input_params)) != HVMCOPY_okay )
>>>>> +            break;
>>>>> +
>>>>> +        /*
>>>>> +         * It is not clear from the spec. if we are supposed to
>>>>> +         * include current virtual CPU in the set or not in this case,
>>>>> +         * so err on the safe side.
>>>>> +         */
>>>>> +        if ( input_params.flags & HV_FLUSH_ALL_PROCESSORS )
>>>>> +            input_params.vcpu_mask = ~0ul;
>>>>> +
>>>>> +        pcpu_mask = curr->arch.hvm_vcpu.viridian.flush_cpumask;
>>>>> +        cpumask_clear(pcpu_mask);
>>>>> +
>>>>> +        /*
>>>>> +         * For each specified virtual CPU flush all ASIDs to invalidate
>>>>> +         * TLB entries the next time it is scheduled and then, if it
>>>>> +         * is currently running, add its physical CPU to a mask of
>>>>> +         * those which need to be interrupted to force a flush.
>>>>> +         */
>>>>> +        for_each_vcpu ( currd, v )
>>>>> +        {
>>>>> +            if ( !(input_params.vcpu_mask & (1ul << v->vcpu_id)) )
>>>>> +                continue;
>>>> You need to cap this loop at a vcpu_id of 63, or the above conditional
>>>> will become undefined.
>>> The guest should not issue the hypercall if it has more than 64 vCPUs, so
>>> to some extent I don't care what happens, as long as it is not harmful to
>>> the hypervisor in general, and I don't think that it is in this case.
>>
>> The compiler is free to do anything it wishes when encountering
>> undefined behaviour, including crash the hypervisor.
>>
> I don't believe so. See section 6.5.7 of the C99 spec. (top of page 85):
>
> "The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits
> are filled with zeros. If E1 has an unsigned type, the value of the result
> is E1 x 2^E2, reduced modulo one more than the maximum value representable
> in the result type. If E1 has a signed type and nonnegative value, and
> E1 x 2^E2 is representable in the result type, then that is the resulting
> value; otherwise, the behavior is undefined"
>
> So, the undefined behaviour you refer to is only in the case that E1 is
> signed, and in this case it is most definitely unsigned, so the modulo
> rule applies.

Ok.
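A minimal standalone sketch of the capped check suggested earlier in the thread (the helper name is hypothetical, and this models only the mask test, outside Xen). Note that, independently of the signedness of E1, C99 6.5.7p3 makes a shift count greater than or equal to the width of the promoted left operand undefined, so the cap keeps vcpu_id strictly below 64 before it is used as a shift count:

```c
#include <stdint.h>

/* Hypothetical helper: is vcpu_id selected by a 64-bit guest-supplied mask?
 * The guest mask can only describe 64 virtual CPUs, so any vcpu_id >= 64 is
 * treated as not selected, which also keeps the shift count below the width
 * of the uint64_t operand (a shift count >= the operand width is undefined
 * behaviour per C99 6.5.7p3). */
static int vcpu_in_mask(uint64_t vcpu_mask, unsigned int vcpu_id)
{
    if ( vcpu_id >= 64 )     /* cap before shifting */
        return 0;
    return (vcpu_mask & (1ul << vcpu_id)) != 0;
}
```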

>
>> Any undefined-behaviour which can be triggered by a guest action
>> warrants an XSA, because there is no telling what might happen.
>>
>>>> It might also be wise to fail the vcpu_initialise() for a
>>>> viridian-enabled domain having more than 64 vcpus.
>>>>
>>>>> +
>>>>> +            hvm_asid_flush_vcpu(v);
>>>>> +            if ( v->is_running )
>>>>> +                cpumask_set_cpu(v->processor, pcpu_mask);
>>>> __cpumask_set_cpu().  No need for atomic operations here.
>>>>
>>> Ok.
>>>
>>>>> +        }
>>>>> +
>>>>> +        /*
>>>>> +         * Since ASIDs have now been flushed it just remains to
>>>>> +         * force any CPUs currently running target vCPUs out of non-
>>>>> +         * root mode. It's possible that re-scheduling has taken place
>>>>> +         * so we may unnecessarily IPI some CPUs.
>>>>> +         */
>>>>> +        if ( !cpumask_empty(pcpu_mask) )
>>>>> +            flush_tlb_mask(pcpu_mask);
>>>> Wouldn't it be easier to simply and input_params.vcpu_mask with
>>>> d->vcpu_dirty_mask ?
>>>>
> Actually I realise your original statement makes no sense anyway. There is
> no such mask as d->vcpu_dirty_mask. There is d->domain_dirty_cpumask, which
> is a mask of *physical* CPUs, but since (as the name implies)
> input_params.vcpu_mask is a mask of *virtual* CPUs, ANDing the two together
> would just yield garbage.
>
>>> No, that may yield much too big a mask. All we need here is a mask of
>>> where the vcpus are running *now*, not everywhere they've been.
>>
>> The dirty mask is a "currently scheduled on" mask.
> No it's not. The comment in sched.h clearly states that
> domain_dirty_cpumask is a "Bitmask of CPUs which are holding onto this
> domain's state", which is, as I said before, essentially everywhere the
> domain's vcpus have been scheduled since the last time state was flushed.
> Since, in this case, I have already invalidated ASIDs for all targeted
> virtual CPUs, I don't need to IPI that many physical CPUs; I only need the
> mask of where the virtual CPUs are *currently* running. If one of them
> gets descheduled before the IPI then the IPI was unnecessary (but there is
> no low-cost way of determining or preventing that).

In the case of a FlushAll, the domain dirty mask would be fine, as you
don't care about a subset of vcpus.  Having said that, it is likely not
worth doing unless we can optimise away the loop entirely.
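For reference, the selection logic being debated can be modelled outside Xen with plain-C stand-ins (all names below are hypothetical; cpumask_t and the vcpu list are reduced to a uint64_t and an array). It follows the shape of the patch: widen the mask for HV_FLUSH_ALL_PROCESSORS, skip unselected vCPUs, and collect the physical CPUs of those currently running:

```c
#include <stdint.h>

#define HV_FLUSH_ALL_PROCESSORS 1u

/* Hypothetical stand-in for Xen's vcpu bookkeeping, modelling only the
 * target-selection step from the patch. */
struct toy_vcpu {
    unsigned int vcpu_id;
    int is_running;
    unsigned int processor;   /* physical CPU the vCPU is running on */
};

static uint64_t build_pcpu_mask(const struct toy_vcpu *vcpus, unsigned int nr,
                                uint64_t vcpu_mask, unsigned int flags)
{
    uint64_t pcpu_mask = 0;
    unsigned int i;

    if ( flags & HV_FLUSH_ALL_PROCESSORS )
        vcpu_mask = ~0ul;        /* err on the safe side, as in the patch */

    for ( i = 0; i < nr; i++ )
    {
        const struct toy_vcpu *v = &vcpus[i];

        /* Cap at 63 so the shift count stays defined, then test the mask. */
        if ( v->vcpu_id >= 64 || !(vcpu_mask & (1ul << v->vcpu_id)) )
            continue;

        /* hvm_asid_flush_vcpu(v) would happen here in the real code; only
         * currently-running vCPUs need their pCPU IPI'd afterwards. */
        if ( v->is_running )
            pcpu_mask |= 1ul << v->processor;
    }
    return pcpu_mask;
}
```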

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

