[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks



>>> On 02.09.16 at 15:15, <feng.wu@xxxxxxxxx> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> Sent: Friday, September 2, 2016 6:46 PM
>> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
>> devel@xxxxxxxxxxxxx 
>> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
>> PI hooks
>> 
>> >>> On 02.09.16 at 12:30, <feng.wu@xxxxxxxxx> wrote:
>> 
>> >
>> >> -----Original Message-----
>> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> Sent: Friday, September 2, 2016 5:26 PM
>> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
>> >> devel@xxxxxxxxxxxxx 
>> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> assigning
>> >> PI hooks
>> >>
>> >> >>> On 02.09.16 at 10:40, <feng.wu@xxxxxxxxx> wrote:
>> >>
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> >> Sent: Friday, September 2, 2016 4:16 PM
>> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
>> >> >> devel@xxxxxxxxxxxxx 
>> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> assigning
>> >> >> PI hooks
>> >> >>
>> >> >> >>> On 02.09.16 at 09:31, <feng.wu@xxxxxxxxx> wrote:
>> >> >>
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> >> >> Sent: Friday, September 2, 2016 3:04 PM
>> >> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> >> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> >> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>;
>> xen-
>> >> >> >> devel@xxxxxxxxxxxxx 
>> >> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
>> >> >> assigning
>> >> >> >> PI hooks
>> >> >> >>
>> >> >> >> >>> On 02.09.16 at 03:46, <feng.wu@xxxxxxxxx> wrote:
>> >> >> >>
>> >> >> >> >
>> >> >> >> >> -----Original Message-----
>> >> >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>> >> >> >> >> Sent: Thursday, September 1, 2016 4:30 PM
>> >> >> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
>> >> >> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> >> >> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>;
>> >> xen-
>> >> >> >> >> devel@xxxxxxxxxxxxx 
>> >> >> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain
>> before/after
>> >> >> >> assigning
>> >> >> >> >> PI hooks
>> >> >> >> >>
>> >> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@xxxxxxxxx> wrote:
>> >> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
>> >> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct domain
>> *d)
>> >> >> >> >> >
>> >> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
>> >> >> >> >> >
>> >> >> >> >> > +    /*
>> >> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
>> >> >> >> >> > +     * running and hence calling the hooks simultaneously
>> >> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
>> >> >> >> >> > +     * all the appropriate state of PI descriptor is actually
>> >> >> >> >> > +     * set up for all vCPus before leaving this function.
>> >> >> >> >> > +     */
>> >> >> >> >> > +    domain_pause(d);
>> >> >> >> >> > +
>> >> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
>> >> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume = vmx_pi_do_resume;
>> >> >> >> >> > +
>> >> >> >> >> > +    domain_unpause(d);
>> >> >> >> >> >  }
>> >> >> >> >>
>> >> >> >> >> First of all I'm missing a word on whether the race mentioned in
>> >> >> >> >> the description and comment can actually happen. Device
>> >> >> >> >> (de)assignment should already be pretty much serialized (via
>> >> >> >> >> the domctl lock, and maybe also via the pcidevs one).
>> >> >> >> >
>> >> >> >> > The purpose of this patch is to address the race condition that
>> >> >> >> > the _vCPU_ is running while we are installing these hooks. Do you
>> >> >> >> > think this cannot happen?  This patch is trying to fix the issue
>> >> >> >> > described at:
>> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> >> > Consider that the other two hooks were installed when the VM
>> >> >> >> > is created, seems no such race condition. However, according
>> >> >> >> > to the discussion about patch 1 and patch 2 of series, we need
>> >> >> >> > to install the other two hooks here as well,
>> >> >> >>
>> >> >> >> I don't think we've agreed that the creation time installation of
>> >> >> >> those hooks is actually necessary. In fact your most recent
>> >> >> >> response to patch 1 makes me think you now agree we don't
>> >> >> >> need to do so. And hence with that precondition not holding
>> >> >> >> anymore, I don't think the conclusion does.
>> >> >> >
>> >> >> > I think there might be some confusion. Let me explain what I
>> >> >> > am think of to make sure we are on the same page:
>> >> >> > 1. We need install all the four hooks when the first device is
>> >> >> > assigned.
>> >> >> > 2. If _1_ is true, the issue described in
>> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229 
>> >> >> > exists.
>> >> >>
>> >> >> If you mean this
>> >> >>
>> >> >> * vcpu 0 starts running on a pcpu
>> >> >> * a device is assigned, causing the hooks to be set
>> >> >> * an interrupt from the device is routed to vcpu 0, but it is not
>> >> >> actually delivered properly, since ndst is not pointing to the right
>> >> >> processor.
>> >> >>
>> >> >> raised by George, then I'm not convinced it can happen (after all, the
>> >> >> hooks get set _before_ the device gets assigned, and hence before
>> >> >> the device can raise an interrupt destined at the guest). And if it can
>> >> >> happen, then rather than pausing the guest I don't see why, along
>> >> >> with setting the hooks, any possibly affected NDST field can't be
>> >> >> programmed correctly. ISTR having recommended something like
>> >> >> this already during review of the series originally introducing PI.
>> >> >
>> >> > Actually here is the scenario I am concerned about:
>> >> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
>> >> > and then vmx_vcpu_block().
>> >> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
>> >> > we may hit the ASSERT() since 'NDST' may not have been set to the
>> >> > current processor yet.
>> >> >
>> >> > My previous solution in v2 is to delete that ASSERT(), but seems you
>> >> > guys don't like it. So here I use this new method in v3 to make sure
>> >> > the vCPU is running while we are installing the hooks.
>> >>
>> >> Indeed, deleting the assertion doesn't seem right. But then why
>> >> can't vmx_vcpu_block() bail early when the domain has no devices
>> >> assigned? That would allow for
>> >>
>> >> 1) set blocking hook
>> >> 2) set up PI state
>> >> 3) actually assign device
>> >
>> > Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
>> > false, we return early? But has_arch_pdev() needs hold
>> > _pcidevs_lock, right?
>> 
>> I don't think you strictly need to: list_empty() will reliably return
>> true in the case of interest here. And possible races when the last
>> device gets removed are - afaics - benign (i.e. it doesn't matter
>> what it returns at that point in time).
> 
> I remind of another case:
> 1. The four hooks are installed.
> 2. vmx_vcpu_block() gets called before other three hooks gets called,
> even if a device has already been assigned to the domain. We may still
> hit the ASSERT() in vmx_vcpu_block() since 'NDST' field is changed in
> the other hooks.

I don't understand: Step 2) in what I've outline above would make
sure NDST is set correctly. Perhaps one should even reverse 2) and
1).

> And that is another reason I use pause/unpause here, it can address
> all the races. And this is an one-time action (Only occurs at the first
> device gets assigned), do you think it is that unacceptable?

No, I've never said it's unacceptable. I just want such to not be
added without good reason.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.