
Re: smmuv1 breakage



Hi Rahul,

Do you have an opinion on how we should move forward on this?

Do you think it is OK to go for a full revert of "xen/arm: smmuv1:
Intelligent SMR allocation" or do you think it is best to go with an
alternative fix? If so, do you have something in mind?



On Tue, 15 Jun 2021, Stefano Stabellini wrote:
> On Tue, 15 Jun 2021, Rahul Singh wrote:
> > Hi Stefano
> > 
> > > On 15 Jun 2021, at 3:21 am, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> > > 
> > > Hi Rahul,
> > > 
> > > Unfortunately, after bisecting, I discovered a few more breakages due to
> > > your smmuv1 series (commits e889809b .. 3e6047ddf) on Xilinx ZynqMP. I
> > > attached the DTB as reference. Please note that I made sure to
> > > cherry-pick "xen/arm: smmuv1: Revert associating the group pointer with
> > > the S2CR" during bisection. So the errors are present also on staging.
> > > 
> > > The first breakage is an error at boot time in smmu.c#find_smmu_master,
> > > see log1. I think it is due to the lack of ability to parse the new smmu
> > > bindings in the old smmu driver.
> > > 
> > > After removing all the "smmus" and "#stream-id-cells" properties in
> > > device tree, I get past the previous error, everything seems to be OK at
> > > early boot, but I actually get SMMU errors as soon as dom0 starts
> > > using devices:
> > > 
> > > (XEN) smmu: /smmu@fd800000: Unexpected global fault, this could be serious
> > > (XEN) smmu: /smmu@fd800000:     GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000877, GFSYNR2 0x00000000
> > 
> > This fault is an "Unidentified stream fault" for StreamID 0x877, which
> > means that no SMMU SMR is configured for StreamID 0x877.
> > 
> > 
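
For anyone following along, here is a minimal sketch of how the values in
the log above map to that reading. It assumes the SMMUv1 global fault
layout (USF is bit 1 of GFSR, MULTI is bit 31, and the StreamID sits in
the low 15 bits of GFSYNR1). The macro and function names are made up for
illustration and are not the ones used in Xen's smmu.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed GFSR bit positions (hypothetical macro names). */
#define EX_GFSR_USF    (UINT32_C(1) << 1)   /* Unidentified stream fault */
#define EX_GFSR_MULTI  (UINT32_C(1) << 31)  /* More than one fault recorded */

/* Assumed StreamID field of GFSYNR1: low 15 bits. */
#define EX_GFSYNR1_SID_MASK  UINT32_C(0x7fff)

/* True if the global fault is an "Unidentified stream fault". */
static bool ex_gfsr_is_usf(uint32_t gfsr)
{
    return (gfsr & EX_GFSR_USF) != 0;
}

/* True if the SMMU recorded multiple outstanding faults. */
static bool ex_gfsr_is_multi(uint32_t gfsr)
{
    return (gfsr & EX_GFSR_MULTI) != 0;
}

/* Extract the faulting StreamID from GFSYNR1. */
static uint32_t ex_gfsynr1_stream_id(uint32_t gfsynr1)
{
    return gfsynr1 & EX_GFSYNR1_SID_MASK;
}
```

With GFSR 0x80000002 and GFSYNR1 0x00000877 from the log, both the USF
and MULTI bits are set and the extracted StreamID is 0x877.
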
> > > [   10.419681] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> > > [   10.426452] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> > > 
> > > Do you think you'll be able to help fix them?
> > > 
> > > 
> > > You should be able to reproduce the two issues using Xilinx QEMU (but to
> > > be honest I haven't tested it on QEMU yet, I was testing on real
> > > hardware):
> > > - clone and compile xilinx QEMU https://github.com/Xilinx/qemu.git
> > >  ./configure  --target-list=aarch64-softmmu
> > >  make
> > > - clone and build git://github.com/Xilinx/qemu-devicetrees.git
> > > - use the attached script to run it
> > >    - kernel can be upstream defconfig 5.10
> > > 
> > 
> > I tried to reproduce the issue on Xilinx QEMU as per the steps shared above 
> > but I am not observing any issue on Xilinx QEMU.
> 
> I tried on QEMU and it doesn't repro. I cannot explain why it works on
> QEMU and it fails on real hardware.
> 
> 
> > I also tested and confirmed on QEMU that the SMMU is configured correctly
> > for StreamID 0x877 specifically, and for the other StreamIDs.
> > 
> > I checked the xen.dtb you shared and found that there is no
> > "#stream-id-cells" property in the master device, but the "mmu-masters"
> > property is present in the smmu node. For the legacy smmu binding we need
> > both "#stream-id-cells" and "mmu-masters".
> > If you need to use the new smmu binding, please add the "#iommu-cells"
> > property in the smmu node and the "iommus" property in the master device.
> 
> In regards to the missing "stream-id-cells" property, I shared the wrong
> dtb before, sorry. I was running a number of tests and I might have
> picked the wrong file. The proper dtb comes with "stream-id-cells" for
> the 0x877 device, see attached.
> 
> 
> 
> > Can you please share the xen boot logs with me so that I can debug further 
> > why the error is observed? 
> 
> See attached. I did some debugging and discovered that it crashes while
> accessing master->of_node in find_smmu_master. If I revert your series,
> the crash goes away. It is very strange because your patches don't touch
> find_smmu_master or insert_smmu_master directly.
> 
> I did a git reset --hard on the commit "xen/arm: smmuv1: Add a stream
> map entry iterator" and it worked, which points to "xen/arm: smmuv1:
> Intelligent SMR allocation" being the problem, even if I have the revert
> cherry-picked on top. Maybe the revert is not reverting enough?
> 
> After this test, I switched back to staging and did:
> git revert 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a
> git revert 0435784cc75dcfef3b5f59c29deb1dbb84265ddb
> 
> And it worked! So the issue truly is that
> 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a doesn't revert "enough".
> See "full-revert" for the patch reverting the remaining code. That on
> top of staging fixes boot for me.
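
As background for the "Unidentified stream fault" discussed above: with
stream matching, a transaction's StreamID has to hit a valid SMR entry,
where bits set in the entry's mask are ignored in the comparison. The
sketch below only illustrates that matching rule. The struct and function
names are hypothetical, the 15-bit StreamID width is an assumption, and
none of this is taken from the Xen driver or the reverted series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software view of one Stream Match Register. */
struct ex_smr {
    uint16_t id;     /* StreamID value to compare against */
    uint16_t mask;   /* bits set here are "don't care" */
    bool     valid;  /* entry takes part in matching */
};

/* Assumed StreamID width: 15 bits. */
#define EX_SID_BITS  UINT32_C(0x7fff)

/* True if sid matches this entry once the masked bits are ignored. */
static bool ex_smr_matches(const struct ex_smr *e, uint16_t sid)
{
    uint32_t diff = (uint32_t)sid ^ (uint32_t)e->id;

    return e->valid && ((diff & ~(uint32_t)e->mask & EX_SID_BITS) == 0);
}

/*
 * Return the index of the first matching entry, or -1 if none matches.
 * "No match" is the situation that would be reported as an
 * unidentified stream fault.
 */
static int ex_smr_find(const struct ex_smr *tbl, size_t n, uint16_t sid)
{
    for (size_t i = 0; i < n; i++) {
        if (ex_smr_matches(&tbl[i], sid))
            return (int)i;
    }
    return -1;
}
```

For example, an entry with id 0x874 and mask 0x3 covers StreamIDs 0x874
to 0x877, while a table holding only id 0x870 with mask 0 leaves 0x877
unmatched.
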

 

