[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 1/2] IOMMU/spinlock: Fix a bug found in AMD IOMMU initialization.



On March 09, 2016 6:25pm, <dario.faggioli@xxxxxxxxxx> wrote:
> On Wed, 2016-03-09 at 07:31 +0000, Xu, Quan wrote:
> > On March 09, 2016 1:19pm, <Tian, Kevin> wrote:
> > >
> > > > When iommu_setup() is called in __start_xen(), interrupts have
> > > > already been enabled, and nothing disables (without reenabling)
> > > > them again in the path that leads to calling
> > > > set_iommu_interrupt_handler().
> > > > There's
> > > > therefore no risk of them being disabled and needing remaining so,
> > > > and hence no need for
> > > no risk of them being 'enabled' since no one disables them again?
> > >
> > Yes,
> >
> Reason why one use _irqsave() when locking is because he doesn't know
> whether interrupt are disabled or not, and wants that, whatever the status is
> (enabled or disabled), it remains unchanged upon unlock (which, therefore,
> does the _irqrestore()).
> 
> If we are absolutely sure that they are enabled, i.e., there is no _risk_ 
> that they
> are disabled, it would be ok to just use _irq() (if disabling interrupt is 
> necessary,
> which is not in this case, but that's another thing) and avoid saving the 
> flags.
> 
> That's how I read the original description (which, of course, does not mean it
> can't be simplified).
> 
Dario, thanks for your share.
I appreciate you always share some knowledge. :):)
btw, the "Linux Device Drivers, 3rd Edition" book also describe it clearly, 
http://www.makelinux.net/ldd3/chp-5-sect-5 

> > > I think it should be clear enough to state that pci_get_pdev doesn't
> > > require holding lock with interrupt disabled (so we should use
> > > spin_lock in this AMD case), and add the fact that when this
> > > function is invoked the interrupt is indeed enabled.
> > >
> I totally agree with this description! (Can we use that as changelog? :-) )
> 

Of course. :)

> > > > In order to fix this, we can use just plain spin_lock() and
> > > > spin_unlock(), instead of spin_lock_irqsave() and
> > > > spin_unlock_irqrestore().
> > > What about revise like below?
> > > --
> > >
> > > pcidevs_lock should be held with interrupt enabled.
> > I am not sure for this point.
> >
> Well, it is true that it should. Altough, I think it's more accurate to say 
> that, as
> Kevin says above, it "doesn't require being held with interrupt disabled".
> 
> > > However there remains an
> > > exception in AMD IOMMU code, where the lock is acquired with
> > > interrupt disabled. This inconsistency can lead to deadlock.
> > >
> Can it? In the case of the occurrence being changed by this patch, I don't 
> think it
> can.

Before this patch, it might. As Jan mentioned, that's just a theoretical 
concern: 
 -spin_lock_irqsave() disables interrupts (on the local processor only) before 
taking the spinlock. 
  Supposed in MP system (pCPU1, pCPU2, ...), when the pCPU1 call the 
spin_lock_irqsave(),
  It does only disable interrupts for pCPU1, and _not_ for other pCPUn.
 -Then, it is mixing interrupt disabled and enabled spinlocks.

> 
> > > The fix is straightforward to use spin_lock instead. Also interrupt
> > > has been enabled when this function is invoked, so we're sure
> > > consistency around pcidevs_lock can be guaranteed after this fix.
> > I really appreciate you send out a revised one, but I think it is not
> > only for consistency.
> >  __Wait for Dario's / Jan's opinions__.
> >
> > To be honest, to me, __my_changlog_ is complicated.
> >
> It is complicated. I think it is a detailed, and IMO correct, description of 
> the
> reason why the patch is ok. That is indeed the purpose of a changelog
> (especially for these kind of patches), but it certainly could be
> summarized/simplified a bit.
> 
> I guess I'm also giving it a try, using what Kevin wrote in the middle of his 
> email
> as a basis (with a small addition about consistency at the end).
> 
> "pci_get_pdev() doesn't require interrupts to be disabled when taking the
> pcidevs_lock, which protects its execution. So, spin_lock() can be used for 
> that,
> and that is what is done almost everywhere.
> 
> Currenlty, there is an exception, in early boot IOMMU initialization on AMD
> systems, where spin_lock_irqsave() and restore are used. However, since, in
> that case too:
>  - we don't need to disable interrupts (for same reasons stated above),
>  - interrupts are enabled already, so there is no need to save and
>    restore flags,
> just change it into using spin_lock(), as everywhere.
> 
> Note that, mixing interrupt disabled and enabled spinlocks is something we
> disallow,

Dario, could you share something in detail?
Quan

> except for very special situations. 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.