[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] xen/arm: fix gnttab_need_iommu_mapping

On Mon, 8 Feb 2021, Julien Grall wrote:
> (+ Jan and Ian for RM/stable decision)
> Hi Jan,
> On 08/02/2021 18:49, Stefano Stabellini wrote:
> > Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
> > The offending chunk is:
> > 
> >   #define gnttab_need_iommu_mapping(d)                    \
> > -    (is_domain_direct_mapped(d) && need_iommu(d))
> > +    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))
> > 
> > On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
> > directly mapped and IOMMU is enabled for the domain, like the old check
> > did, but the new check is always false.
> > 
> > In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
> > need_sync is set as:
> > 
> >      if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> >          hd->need_sync = !iommu_use_hap_pt(d);
> > 
> > iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
> > P2M. It is true on ARM. need_sync means that you have a separate IOMMU
> > page-table and it needs to be updated for every change. need_sync is set
> > to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
> > which is wrong.
> > 
> > As a consequence, when using PV network from a domU on a system where
> > IOMMU is on from Dom0, I get:
> > 
> > (XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402,
> > iova=0x8424cb148, fsynr=0xb0001, cb=0
> > [   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> > 
> > The fix is to go back to something along the lines of the old
> > implementation of gnttab_need_iommu_mapping.
> > 
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
> > Fixes: 91d4eca7add
> The format for fixes tag is:
> Fixes: sha ("<commit title>")

Oops. Can be fixed on commit.

> For staging fix:
> Reviewed-by: Julien Grall <jgrall@xxxxxxxxxx>

Thank you!

> @Ian, I think this wants to go in 4.15. Without it, Xen may receive an IOMMU
> fault for DMA transaction using granted page.
> > Backport: 4.12+
> > 
> > ---
> > 
> > Given the severity of the bug, I would like to request this patch to be
> > backported to 4.12 too, even if 4.12 is security-fixes only since Oct
> > 2020.
> I would agree that the bug is bad, but it is not clear to me why this would be
> warrant for an exception for backporting. Can you outline what's the worse
> that can happen?
> Correct me if I am wrong, if one can hit this error, then it should be pretty
> reliable. Therefore, anyone wanted to use 4.12 in production should have seen
> if the error on there setup by now (4.12 has been out for nearly two years).
> If not, then they are most likely not affected.
> Any new users of Xen should use the latest stable rather than starting with an
> old version.

Yes, the bug reproduces reliably but it takes more than a smoke test to
find it. That's why it wasn't found by OSSTest and also our internal
CI-loop at Xilinx.

Users can be very slow at upgrading, so I am worried that 4.12 might still
be picked by somebody, especially given that it is still security
supported for a while.

> Other than the seriousness of the bug, I think there is also a fairness
> concern.
> So far our rules says there is only security support backport allowed. If we
> start granting exception, then we need a way to prevent abuse of it. To take
> an extreme example, why one couldn't ask backport for 4.2?
> That said, I vaguely remember this topic was brought up a few time on
> security@. So maybe it is time to have a new discussion about stable tree.

I wouldn't consider a backport for a tree that is closed even for
security backports. So in your example, I'd say no to a backport to 4.2
or 4.10.

I think there is a valid question for trees that are still open to
security fixes but not general backports.

For these cases, I would just follow a simple rule of thumb:
- is the submitter willing to provide the backport?
- is the backport low-risk?
- is the underlying bug important?

If the answer to all is "yes" then I'd go with it.

In this case, given that the fix is a one-liner, and obviously correct,
I think it is worth considering.



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.