[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks



>>> On 02.09.16 at 12:30, <feng.wu@xxxxxxxxx> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Friday, September 2, 2016 5:26 PM
>> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
>> devel@xxxxxxxxxxxxx 
>> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 02.09.16 at 10:40, <feng.wu@xxxxxxxxx> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> Sent: Friday, September 2, 2016 4:16 PM
>> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
>> >> devel@xxxxxxxxxxxxx 
>> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> assigning
>> >> PI hooks
>> >>
>> >> >>> On 02.09.16 at 09:31, <feng.wu@xxxxxxxxx> wrote:
>> >>
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> >> Sent: Friday, September 2, 2016 3:04 PM
>> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
>> >> >> devel@xxxxxxxxxxxxx 
>> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> assigning
>> >> >> PI hooks
>> >> >>
>> >> >> >>> On 02.09.16 at 03:46, <feng.wu@xxxxxxxxx> wrote:
>> >> >>
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> >> >> Sent: Thursday, September 1, 2016 4:30 PM
>> >> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> >> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> >> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>;
>> xen-
>> >> >> >> devel@xxxxxxxxxxxxx 
>> >> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> >> assigning
>> >> >> >> PI hooks
>> >> >> >>
>> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@xxxxxxxxx> wrote:
>> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain *d)
>> >> >> >> >
>> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >> >> >> >
>> >> >> >> > +    /*
>> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
>> >> >> >> > +     * running and hence calling the hooks simultaneously
>> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
>> >> >> >> > +     * all the appropriate state of PI descriptor is actually
>> >> >> >> > +     * set up for all vCPus before leaving this function.
>> >> >> >> > +     */
>> >> >> >> > +    domain_pause(d);
>> >> >> >> > +
>> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> >> >> >> > +
>> >> >> >> > +    domain_unpause(d);
>> >> >> >> >  }
>> >> >> >>
>> >> >> >> First of all I'm missing a word on whether the race mentioned in
>> >> >> >> the description and comment can actually happen. Device
>> >> >> >> (de)assignment should already be pretty much serialized (via
>> >> >> >> the domctl lock, and maybe also via the pcidevs one).
>> >> >> >
>> >> >> > The purpose of this patch is to address the race condition that
>> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
>> >> >> > think this cannot happen?  This patch is trying to fix the issue
>> >> >> > described at:
>> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> > Consider that the other two hooks were installed when the VM
>> >> >> > is created, seems no such race condition. However, according
>> >> >> > to the discussion about patch 1 and patch 2 of series, we need
>> >> >> > to install the other two hooks here as well,
>> >> >>
>> >> >> I don't think we've agreed that the creation time installation of
>> >> >> those hooks is actually necessary. In fact your most recent
>> >> >> response to patch 1 makes me think you now agree we don't
>> >> >> need to do so. And hence with that precondition not holding
>> >> >> anymore, I don't think the conclusion does.
>> >> >
>> >> > I think there might be some confusion. Let me explain what I
>> >> > am think of to make sure we are on the same page:
>> >> > 1. We need install all the four hooks when the first device is
>> >> > assigned.
>> >> > 2. If _1_ is true, the issue described in
>> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> > exists.
>> >>
>> >> If you mean this
>> >>
>> >> * vcpu 0 starts running on a pcpu
>> >> * a device is assigned, causing the hooks to be set
>> >> * an interrupt from the device is routed to vcpu 0, but it is not
>> >> actually delivered properly, since ndst is not pointing to the right
>> >> processor.
>> >>
>> >> raised by George, then I'm not convinced it can happen (after all, the
>> >> hooks get set _before_ the device gets assigned, and hence before
>> >> the device can raise an interrupt destined at the guest). And if it can
>> >> happen, then rather than pausing the guest I don't see why, along
>> >> with setting the hooks, any possibly affected NDST field can't be
>> >> programmed correctly. ISTR having recommended something like
>> >> this already during review of the series originally introducing PI.
>> >
>> > Actually here is the scenario I am concerned about:
>> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
>> > and then vmx_vcpu_block().
>> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
>> > we may hit the ASSERT() since 'NDST' may not have been set to the
>> > current processor yet.
>> >
>> > My previous solution in v2 is to delete that ASSERT(), but seems you
>> > guys don't like it. So here I use this new method in v3 to make sure
>> > the vCPU is running while we are installing the hooks.
>> 
>> Indeed, deleting the assertion doesn't seem right. But then why
>> can't vmx_vcpu_block() bail early when the domain has no devices
>> assigned? That would allow for
>> 
>> 1) set blocking hook
>> 2) set up PI state
>> 3) actually assign device
> 
> Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
> false, we return early? But has_arch_pdev() needs hold
> _pcidevs_lock, right?

I don't think you strictly need to: list_empty() will reliably return
true in the case of interest here. And possible races when the last
device gets removed are - afaics - benign (i.e. it doesn't matter
what it returns at that point in time).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.