[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks



On Wed, 2016-09-14 at 02:23 +0000, Wu, Feng wrote:
> Then I tried to implement the function like the following:
> 
> /* This function is called when pcidevs_lock is held */
> void vmx_pi_hooks_assign(struct domain *d)
> {
>     if ( !iommu_intpost || !has_hvm_container_domain(d) )
>         return;
> 
>     ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> 
>     /*
>      * We carefully handle the timing here:
>      * - Install the context switch first
>      * - Then set the NDST field
>      * - Install the block and resume hooks in the end
>      *
>      * This can make sure the PI (especially the NDST feild) is
>      * in proper state when we call vmx_vcpu_block().
>      */
>     d->arch.hvm_domain.vmx.pi_switch_from = vmx_pi_switch_from;
>     d->arch.hvm_domain.vmx.pi_switch_to = vmx_pi_switch_to;
> 
>     for_each_vcpu ( d, v )
>     {
>         unsigned int dest = cpu_physical_id(v->processor);
>         struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
> 
>         /* spot 1 */
> 
>         write_atomic(&pi_desc->ndst,
>                      x2apic_enabled ? dest : MASK_INSR(dest,
> PI_xAPIC_NDST_MASK));
>     }
> 
>     d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>     d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
> }
> 
> However, I still think it is problematic. Consider the _spot 1_
> above, we get
> the pCPU of the vCPU and update the NDST, but the vCPU can be happily
> rescheduled to another pCPU before updating the NDST. The problem is
> that it is not safe to update NDST here, since vCPU can be scheduled
> behind us at any time. And if this is the case, it is hard to safely
> do this
> without guarantee the vCPU is in a known state. Any further advice
> on this? Thanks a lot!
> 
So, I'm sorry if this is me missing or not remembering something... but
I do remember that, at some point, in the early days of this series,
there were concerns about the use of v->processor behind the back of
the scheduler, i.e., without holding the proper scheduler related
locks.

Pausing the domain was not even being remotely considered at the time,
it (again, at least AFAICR) came up later for trying to address other
issues.

No, the whole point of why that was not a problem in the first place is
that what counts here is on the wait list of what pcpu the vcpu is put,
not really where the vcpu is being or has been scheduled last. Of
course it'd be better --and it would also be true most of the times--
if there where a match, but that was not a correctness concern.

So why this is all of the sudden becoming one? Am I completely off with
my recollection (or in general :-P)? Or what am I missing about the
issue we are trying to address with this new bits of the work?

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)

Attachment: signature.asc
Description: This is a digitally signed message part

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.