| 
    
 [Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] pci: cleanup MSI interrupts before removing device from IOMMU
 On Wed, Oct 21, 2020 at 01:20:27PM +0200, Jan Beulich wrote:
> On 21.10.2020 10:19, Roger Pau Monne wrote:
> > Doing the MSI cleanup after removing the device from the IOMMU leads
> > to the following panic on AMD hardware:
> > 
> > Assertion 'table.ptr && (index < intremap_table_entries(table.ptr, iommu))' 
> > failed at iommu_intr.c:172
> > ----[ Xen-4.13.1-10.0.3-d  x86_64  debug=y   Not tainted ]----
> > CPU:    3
> > RIP:    e008:[<ffff82d08026ae3c>] 
> > drivers/passthrough/amd/iommu_intr.c#get_intremap_entry+0x52/0x7b
> > [...]
> > Xen call trace:
> >    [<ffff82d08026ae3c>] R 
> > drivers/passthrough/amd/iommu_intr.c#get_intremap_entry+0x52/0x7b
> >    [<ffff82d08026af25>] F 
> > drivers/passthrough/amd/iommu_intr.c#update_intremap_entry_from_msi_msg+0xc0/0x342
> >    [<ffff82d08026ba65>] F amd_iommu_msi_msg_update_ire+0x98/0x129
> >    [<ffff82d08025dd36>] F iommu_update_ire_from_msi+0x1e/0x21
> >    [<ffff82d080286862>] F msi_free_irq+0x55/0x1a0
> >    [<ffff82d080286f25>] F pci_cleanup_msi+0x8c/0xb0
> >    [<ffff82d08025cf52>] F pci_remove_device+0x1af/0x2da
> >    [<ffff82d0802a42d1>] F do_physdev_op+0xd18/0x1187
> >    [<ffff82d080383925>] F pv_hypercall+0x1f5/0x567
> >    [<ffff82d08038a432>] F lstar_enter+0x112/0x120
> > 
> > That's because the call to iommu_remove_device on AMD hardware will
> > remove the per-device interrupt remapping table, and hence the call to
> > pci_cleanup_msi done afterwards will find a null intremap table and
> > crash.
> > 
> > Reorder the calls so that MSI interrupts are torn down before removing
> > the device from the IOMMU.
> 
> I guess this wants
> 
> Fixes: d7cfeb7c13ed ("AMD/IOMMU: don't blindly allocate interrupt remapping 
> tables")
> 
> ?
Oh yes, I didn't git blame the file to figure out when such allocating
and freeing was added.
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> 
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
Thanks.
> > --- a/xen/drivers/passthrough/pci.c
> > +++ b/xen/drivers/passthrough/pci.c
> > @@ -834,10 +834,15 @@ int pci_remove_device(u16 seg, u8 bus, u8 devfn)
> >      list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
> >          if ( pdev->bus == bus && pdev->devfn == devfn )
> >          {
> > +            /*
> > +             * Cleanup MSI interrupts before removing the device from the
> > +             * IOMMU, or else the internal IOMMU data used to track the 
> > device
> > +             * interrupts might be already gone.
> > +             */
> > +            pci_cleanup_msi(pdev);
> >              ret = iommu_remove_device(pdev);
> >              if ( pdev->domain )
> >                  list_del(&pdev->domain_list);
> > -            pci_cleanup_msi(pdev);
> 
> To be honest I'm not sure about the comment. It should really have
> been this way from the very beginning, and VT-d not being affected
> makes me wonder what possible improvements are there waiting to be
> noticed and then carried out.
I'm fine with dropping the comment, I would also expect the normal
flow to be to cleanup any interrupt and then remove the device,
instead of the other way around.
Roger.
 
  | 
  
![]()  | 
            
         Lists.xenproject.org is hosted with RackSpace, monitoring our  |