
Re: [Xen-devel] [BUG] After upgrade to Xen 4.12.0 iommu=no-igfx



On Thu, Aug 1, 2019 at 1:16 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>
> On Wed, Jul 31, 2019 at 02:03:24PM -0700, Roman Shaposhnik wrote:
> > On Wed, Jul 31, 2019 at 12:46 PM Andrew Cooper
> > <andrew.cooper3@xxxxxxxxxx> wrote:
> > >
> > > On 31/07/2019 20:35, Roman Shaposhnik wrote:
> > > > On Wed, Jul 31, 2019 at 1:43 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> > > >> On Wed, Jul 31, 2019 at 10:36:31AM +0200, Roger Pau Monné wrote:
> > > >>> On Tue, Jul 30, 2019 at 10:55:24AM -0700, Roman Shaposhnik wrote:
> > > > >>>> Sorry -- got a bit distracted yesterday. Attached is the log with
> > > > >>>> only your latest patch applied. Interestingly enough, the box booted
> > > > >>>> fine without screen artifacts. So I guess we're getting closer...
> > > >>>>
> > > >>>> Thanks for all the help!
> > > > >>> That's quite weird, there are no functional changes between the
> > > >>> previous patches and this one, the only difference is that this patch
> > > >>> has more verbose output.
> > > >>>
> > > >>> Are you sure you didn't have any local patches on top of Xen that
> > > >>> could explain this difference in behaviour?
> > > >> FWIW, can you please try the plain patch again:
> > > >>
> > > >> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg01547.html
> > > >>
> > > >> And report back?
> > > >>
> > > >> I would like to get this committed ASAP if it does fix your issue.
> > > > > I'd like to say that it did -- but I tried it again just now and I
> > > > > still get a garbled screen and tons of:
> > > >
> > > > (XEN) printk: 26665 messages suppressed.
> > > > > (XEN) [VT-D]DMAR:[DMA Read] Request device [0000:00:02.0] fault addr 8e14c000, iommu reg = ffff82c0008de000
> > > >
> > > > > I'm very much confused by what's going on, but it seems that's the
> > > > > case -- adding those debug print statements makes the issue go away.
> > > >
> > > > Here are the patches that are being applied:
> > > >    NOT WORKING:
> > > > https://github.com/rvs/eve/blob/xen-bug/pkg/xen/01-iommu-mappings.patch
> > > >
> > > >    WORKING: 
> > > > https://github.com/rvs/eve/blob/a1291fcd4e669df2a63285afb5e8b4841f45c1c8/pkg/xen/01-iommu-mappings.patch
> > > >
> > > > At this point I'm really not sure what's going on.
> > >
> > > OK, seeing as you've double checked this, the mystery deepens.
> > >
> > > My bet is on the intel_iommu_lookup_page() call having side effects[1].
> > > If you take out the debugging in the middle of the loop in
> > > rmrr_identity_mapping(), does the problem reproduce again?
> > >
> > > ~Andrew
> > >
> > > [1] Looking at the internals of addr_to_dma_page_maddr(), it does 100%
> > > more memory allocation and higher-level PTE construction than looks wise
> > > for what is supposed to be a getter.
> >
> > Yup. That's what it is -- intel_iommu_lookup_page() seems to be the culprit.
> >
> > I did the experiment in the other direction -- adding a dummy call:
> >      https://github.com/rvs/eve/blob/36aeeaa7c0a53474fb1ecef2ff587a86637df858/pkg/xen/01-iommu-mappings.patch#L23
> > on top of Roger's original patch makes the system boot NORMALLY.
>
> I'm again quite lost, and I don't really understand why mappings added
> by arch_iommu_hwdom_init seem to work fine while mappings added by
> rmrr_identity_mapping don't.
>
> I have yet another patch for you to try, which attempts to mimic
> exactly what arch_iommu_hwdom_init does in rmrr_identity_mapping,
> can you please give it a try?
>
> This has the added bonus of limiting the use of
> {set/clear}_identity_p2m_entry to translated domains only, since
> rmrr_identity_mapping was the only caller against PV domains.
>
> Thanks, Roger.
> ---8<---
> diff --git a/xen/arch/x86/mm/p2m.c b/xen/arch/x86/mm/p2m.c
> index fef97c82f6..d36a58b1a6 100644
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1341,10 +1341,8 @@ int set_identity_p2m_entry(struct domain *d, unsigned long gfn_l,
>
>      if ( !paging_mode_translate(p2m->domain) )
>      {
> -        if ( !need_iommu_pt_sync(d) )
> -            return 0;
> -        return iommu_legacy_map(d, _dfn(gfn_l), _mfn(gfn_l), PAGE_ORDER_4K,
> -                                IOMMUF_readable | IOMMUF_writable);
> +        ASSERT_UNREACHABLE();
> +        return -ENXIO;
>      }
>
>      gfn_lock(p2m, gfn, 0);
> @@ -1432,9 +1430,8 @@ int clear_identity_p2m_entry(struct domain *d, unsigned long gfn_l)
>
>      if ( !paging_mode_translate(d) )
>      {
> -        if ( !need_iommu_pt_sync(d) )
> -            return 0;
> -        return iommu_legacy_unmap(d, _dfn(gfn_l), PAGE_ORDER_4K);
> +        ASSERT_UNREACHABLE();
> +        return -ENXIO;
>      }
>
>      gfn_lock(p2m, gfn, 0);
> diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
> index 5d72270c5b..62df5ca5aa 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1969,6 +1969,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
>      unsigned long end_pfn = PAGE_ALIGN_4K(rmrr->end_address) >> PAGE_SHIFT_4K;
>      struct mapped_rmrr *mrmrr;
>      struct domain_iommu *hd = dom_iommu(d);
> +    unsigned int flush_flags = 0;
>
>      ASSERT(pcidevs_locked());
>      ASSERT(rmrr->base_address < rmrr->end_address);
> @@ -1982,7 +1983,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
>          if ( mrmrr->base == rmrr->base_address &&
>               mrmrr->end == rmrr->end_address )
>          {
> -            int ret = 0;
> +            int ret = 0, err;
>
>              if ( map )
>              {
> @@ -1995,13 +1996,20 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
>
>              while ( base_pfn < end_pfn )
>              {
> -                if ( clear_identity_p2m_entry(d, base_pfn) )
> -                    ret = -ENXIO;
> +                if ( paging_mode_translate(d) )
> +                    ret = clear_identity_p2m_entry(d, base_pfn);
> +                else
> +                    ret = iommu_unmap(d, _dfn(base_pfn), PAGE_ORDER_4K,
> +                                      &flush_flags);
>                  base_pfn++;
>              }
>
>              list_del(&mrmrr->list);
>              xfree(mrmrr);
> +            /* Keep the previous error code if there's one. */
> +            err = iommu_iotlb_flush_all(d, flush_flags);
> +            if ( !ret )
> +                ret = err;
>              return ret;
>          }
>      }
> @@ -2011,8 +2019,13 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
>
>      while ( base_pfn < end_pfn )
>      {
> -        int err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> +        int err;
>
> +        if ( paging_mode_translate(d) )
> +            err = set_identity_p2m_entry(d, base_pfn, p2m_access_rw, flag);
> +        else
> +            err = iommu_map(d, _dfn(base_pfn), _mfn(base_pfn), PAGE_ORDER_4K,
> +                            IOMMUF_readable | IOMMUF_writable, &flush_flags);
>          if ( err )
>              return err;
>          base_pfn++;
> @@ -2026,7 +2039,7 @@ static int rmrr_identity_mapping(struct domain *d, bool_t map,
>      mrmrr->count = 1;
>      list_add_tail(&mrmrr->list, &hd->arch.mapped_rmrrs);
>
> -    return 0;
> +    return iommu_iotlb_flush_all(d, flush_flags);
>  }
>
>  static int intel_iommu_add_device(u8 devfn, struct pci_dev *pdev)

This patch completely fixes the problem for me!
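
For anyone skimming the thread: in the non-translated (PV) path, the patch above switches the mapping loop to collect flush flags from each iommu_map() call and then issue one iommu_iotlb_flush_all() at the end. Below is a rough standalone sketch of that pattern only. It is a toy model, not Xen code: every toy_* name, the flag value, and the fault-injection variable are made up for illustration, and it does not claim to explain why the earlier patches misbehaved.

```c
#include <stdio.h>

/* Hypothetical flag, loosely modelled on the flush_flags idea in the patch. */
#define TOY_FLUSHF_added (1u << 0)

/* Counts how many times the (pretend) IOTLB flush was actually issued. */
static unsigned int toy_flush_count;

/* Hypothetical fault injection: toy_map() fails when asked for this pfn. */
static unsigned long toy_fail_pfn = ~0UL;

/* Pretend to install one identity mapping and note that a flush is pending. */
static int toy_map(unsigned long pfn, unsigned int *flush_flags)
{
    if ( pfn == toy_fail_pfn )
        return -1;

    *flush_flags |= TOY_FLUSHF_added;
    return 0;
}

/* Pretend to flush; a no-op when nothing was accumulated. */
static int toy_flush_all(unsigned int flush_flags)
{
    if ( flush_flags )
        toy_flush_count++;

    return 0;
}

/*
 * Map [base_pfn, end_pfn) one page at a time, accumulating flush flags,
 * and flush once after the loop. As in the patch's map path, an error
 * returns immediately without reaching the final flush.
 */
static int toy_identity_map_range(unsigned long base_pfn,
                                  unsigned long end_pfn)
{
    unsigned int flush_flags = 0;

    while ( base_pfn < end_pfn )
    {
        int err = toy_map(base_pfn, &flush_flags);

        if ( err )
            return err;
        base_pfn++;
    }

    return toy_flush_all(flush_flags);
}
```

Calling toy_identity_map_range(0x8e000, 0x8e004) in this model maps four pages and bumps toy_flush_count once rather than four times; an empty range leaves the counter untouched.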

Thanks Roger! I'd love to see this in Xen 4.13.

Thanks,
Roman.


 

