
Re: [Xen-devel] [PATCH v3 4/6] Pause/Unpause the domain before/after assigning PI hooks




> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Friday, September 2, 2016 9:55 PM
> To: Wu, Feng <feng.wu@xxxxxxxxx>
> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
> devel@xxxxxxxxxxxxx
> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after assigning
> PI hooks
> 
> >>> On 02.09.16 at 15:15, <feng.wu@xxxxxxxxx> wrote:
> 
> >
> >> -----Original Message-----
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: Friday, September 2, 2016 6:46 PM
> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
> >> devel@xxxxxxxxxxxxx
> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> assigning
> >> PI hooks
> >>
> >> >>> On 02.09.16 at 12:30, <feng.wu@xxxxxxxxx> wrote:
> >>
> >> >
> >> >> -----Original Message-----
> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> Sent: Friday, September 2, 2016 5:26 PM
> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>; xen-
> >> >> devel@xxxxxxxxxxxxx
> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> >> assigning
> >> >> PI hooks
> >> >>
> >> >> >>> On 02.09.16 at 10:40, <feng.wu@xxxxxxxxx> wrote:
> >> >>
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> >> Sent: Friday, September 2, 2016 4:16 PM
> >> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
> >> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
> >> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>;
> xen-
> >> >> >> devel@xxxxxxxxxxxxx
> >> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain before/after
> >> >> assigning
> >> >> >> PI hooks
> >> >> >>
> >> >> >> >>> On 02.09.16 at 09:31, <feng.wu@xxxxxxxxx> wrote:
> >> >> >>
> >> >> >> >
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> >> >> Sent: Friday, September 2, 2016 3:04 PM
> >> >> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
> >> >> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
> >> >> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>;
> >> xen-
> >> >> >> >> devel@xxxxxxxxxxxxx
> >> >> >> >> Subject: RE: [PATCH v3 4/6] Pause/Unpause the domain
> before/after
> >> >> >> assigning
> >> >> >> >> PI hooks
> >> >> >> >>
> >> >> >> >> >>> On 02.09.16 at 03:46, <feng.wu@xxxxxxxxx> wrote:
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> >> -----Original Message-----
> >> >> >> >> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> >> >> >> >> Sent: Thursday, September 1, 2016 4:30 PM
> >> >> >> >> >> To: Wu, Feng <feng.wu@xxxxxxxxx>
> >> >> >> >> >> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
> >> >> >> >> >> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin
> <kevin.tian@xxxxxxxxx>;
> >> >> xen-
> >> >> >> >> >> devel@xxxxxxxxxxxxx
> >> >> >> >> >> Subject: Re: [PATCH v3 4/6] Pause/Unpause the domain
> >> before/after
> >> >> >> >> assigning
> >> >> >> >> >> PI hooks
> >> >> >> >> >>
> >> >> >> >> >> >>> On 31.08.16 at 05:56, <feng.wu@xxxxxxxxx> wrote:
> >> >> >> >> >> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> >> >> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >> >> >> >> >> > @@ -219,8 +219,19 @@ void vmx_pi_hooks_assign(struct
> domain
> >> *d)
> >> >> >> >> >> >
> >> >> >> >> >> >      ASSERT(!d->arch.hvm_domain.vmx.vcpu_block);
> >> >> >> >> >> >
> >> >> >> >> >> > +    /*
> >> >> >> >> >> > +     * Pausing the domain can make sure the vCPU is not
> >> >> >> >> >> > +     * running and hence calling the hooks simultaneously
> >> >> >> >> >> > +     * when deassigning the PI hooks. This makes sure that
> >> >> >> >> >> > +     * all the appropriate state of PI descriptor is 
> >> >> >> >> >> > actually
> >> >> >> >> >> > +     * set up for all vCPus before leaving this function.
> >> >> >> >> >> > +     */
> >> >> >> >> >> > +    domain_pause(d);
> >> >> >> >> >> > +
> >> >> >> >> >> >      d->arch.hvm_domain.vmx.vcpu_block = vmx_vcpu_block;
> >> >> >> >> >> >      d->arch.hvm_domain.vmx.pi_do_resume =
> vmx_pi_do_resume;
> >> >> >> >> >> > +
> >> >> >> >> >> > +    domain_unpause(d);
> >> >> >> >> >> >  }
> >> >> >> >> >>
> >> >> >> >> >> First of all I'm missing a word on whether the race mentioned 
> >> >> >> >> >> in
> >> >> >> >> >> the description and comment can actually happen. Device
> >> >> >> >> >> (de)assignment should already be pretty much serialized (via
> >> >> >> >> >> the domctl lock, and maybe also via the pcidevs one).
> >> >> >> >> >
> >> >> >> >> > The purpose of this patch is to address the race condition that
> >> >> >> >> > the _vCPU_ is running while we are installing these hooks. Do 
> >> >> >> >> > you
> >> >> >> >> > think this cannot happen?  This patch is trying to fix the issue
> >> >> >> >> > described at:
> >> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> >> >> >> > Consider that the other two hooks were installed when the VM
> >> >> >> >> > is created, seems no such race condition. However, according
> >> >> >> >> > to the discussion about patch 1 and patch 2 of series, we need
> >> >> >> >> > to install the other two hooks here as well,
> >> >> >> >>
> >> >> >> >> I don't think we've agreed that the creation time installation of
> >> >> >> >> those hooks is actually necessary. In fact your most recent
> >> >> >> >> response to patch 1 makes me think you now agree we don't
> >> >> >> >> need to do so. And hence with that precondition not holding
> >> >> >> >> anymore, I don't think the conclusion does.
> >> >> >> >
> >> >> >> > I think there might be some confusion. Let me explain what I
> >> >> >> > am think of to make sure we are on the same page:
> >> >> >> > 1. We need install all the four hooks when the first device is
> >> >> >> > assigned.
> >> >> >> > 2. If _1_ is true, the issue described in
> >> >> >> > http://www.gossamer-threads.com/lists/xen/devel/433229
> >> >> >> > exists.
> >> >> >>
> >> >> >> If you mean this
> >> >> >>
> >> >> >> * vcpu 0 starts running on a pcpu
> >> >> >> * a device is assigned, causing the hooks to be set
> >> >> >> * an interrupt from the device is routed to vcpu 0, but it is not
> >> >> >> actually delivered properly, since ndst is not pointing to the right
> >> >> >> processor.
> >> >> >>
> >> >> >> raised by George, then I'm not convinced it can happen (after all, 
> >> >> >> the
> >> >> >> hooks get set _before_ the device gets assigned, and hence before
> >> >> >> the device can raise an interrupt destined at the guest). And if it 
> >> >> >> can
> >> >> >> happen, then rather than pausing the guest I don't see why, along
> >> >> >> with setting the hooks, any possibly affected NDST field can't be
> >> >> >> programmed correctly. ISTR having recommended something like
> >> >> >> this already during review of the series originally introducing PI.
> >> >> >
> >> >> > Actually here is the scenario I am concerned about:
> >> >> > 1. ' vmx_vcpu_block ' is installed while vCPU is running vcpu_block()
> >> >> > and then vmx_vcpu_block().
> >> >> > 2. There is a ASSERT() about 'NDST' field in vmx_vcpu_block(), I think
> >> >> > we may hit the ASSERT() since 'NDST' may not have been set to the
> >> >> > current processor yet.
> >> >> >
> >> >> > My previous solution in v2 is to delete that ASSERT(), but seems you
> >> >> > guys don't like it. So here I use this new method in v3 to make sure
> >> >> > the vCPU is running while we are installing the hooks.
> >> >>
> >> >> Indeed, deleting the assertion doesn't seem right. But then why
> >> >> can't vmx_vcpu_block() bail early when the domain has no devices
> >> >> assigned? That would allow for
> >> >>
> >> >> 1) set blocking hook
> >> >> 2) set up PI state
> >> >> 3) actually assign device
> >> >
> >> > Do you mean we check has_arch_pdev() in vmx_vcpu_block(), if it is
> >> > false, we return early? But has_arch_pdev() needs hold
> >> > _pcidevs_lock, right?
> >>
> >> I don't think you strictly need to: list_empty() will reliably return
> >> true in the case of interest here. And possible races when the last
> >> device gets removed are - afaics - benign (i.e. it doesn't matter
> >> what it returns at that point in time).
> >
> > I remind of another case:
> > 1. The four hooks are installed.
> > 2. vmx_vcpu_block() gets called before other three hooks gets called,
> > even if a device has already been assigned to the domain. We may still
> > hit the ASSERT() in vmx_vcpu_block() since 'NDST' field is changed in
> > the other hooks.
> 
> I don't understand: Step 2) in what I've outline above would make
> sure NDST is set correctly. Perhaps one should even reverse 2) and
> 1).

I still have trouble fully understanding this; please see the following
scenario:

1) Set 'NDST' to pCPU0 (the pCPU the vCPU is currently running on, but
that can change since vCPU scheduling happens behind our backs, so how
do we decide which pCPU to use for 'NDST'?)
2) Install all four hooks, but the vCPU gets re-scheduled before
'vmx_pi_switch_to()' is installed, so the vCPU may be moved to pCPU1
without 'NDST' being updated.
3) vmx_vcpu_block() gets called and we hit the ASSERT() (see the
sketch below).
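
To make the concern concrete, here is a minimal sketch of the check I
mean. It is only an approximation of vmx_vcpu_block() in
xen/arch/x86/hvm/vmx/vmx.c; the field and macro names are from my
reading of the tree and may not match the actual code exactly:

    static void vmx_vcpu_block(struct vcpu *v)
    {
        struct pi_desc *pi_desc = &v->arch.hvm_vmx.pi_desc;
        /* Destination id of the pCPU this vCPU is currently running on. */
        unsigned int dest = cpu_physical_id(v->processor);

        /*
         * If the vCPU migrated from pCPU0 to pCPU1 before
         * vmx_pi_switch_to() was installed, pi_desc->ndst still encodes
         * pCPU0 at this point and the assertion fires.
         */
        ASSERT(pi_desc->ndst ==
               (x2apic_enabled ? dest
                               : MASK_INSR(dest, PI_xAPIC_NDST_MASK)));

        /* ... the vCPU is then put on this pCPU's PI blocking list ... */
    }

Pausing the domain around the hook assignment guarantees that no vCPU
can be inside this path while the hooks and 'NDST' are being set up,
which is why I went with domain_pause()/domain_unpause().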

Maybe I missed something; it would be appreciated if you could
correct me if my understanding is wrong.

Thanks,
Feng

> 
> > And that is another reason I use pause/unpause here, it can address
> > all the races. And this is an one-time action (Only occurs at the first
> > device gets assigned), do you think it is that unacceptable?
> 
> No, I've never said it's unacceptable. I just want such to not be
> added without good reason.
> 
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

