Re: smmuv1 breakage
On Wed, 23 Jun 2021, Rahul Singh wrote:
> Hi Stefano,
>
> > On 23 Jun 2021, at 9:09 am, Rahul Singh <Rahul.Singh@xxxxxxx> wrote:
> >
> > Hi Stefano,
> >
> >> On 22 Jun 2021, at 10:06 pm, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> >>
> >> Hi Rahul,
> >>
> >> Do you have an opinion on how we should move forward on this?
> >>
> >> Do you think it is OK to go for a full revert of "xen/arm: smmuv1:
> >> Intelligent SMR allocation" or do you think it is best to go with an
> >> alternative fix? If so, do you have something in mind?
> >>
> >
> > Sorry for the late reply I was working on another high-priority task.
> > I will work on this will try to fix the issue. I will update you within
> > 2-3 days.
>
> I again checked my patches and found out that while allocating SMR I by
> mistake allocated one SMR for each SMMU device but we have to allocate the
> number of SMR based on supported stream matching register for each SMMU
> device.
>
> This might be causing the issue. As I don't have any Xilinx hardware and on
> QEMU/Juno issue is not reproducible. Can you please test the attached patch
> and let me know if it works.

Yes this solves the issue for me, thank you!!

Acked-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
Tested-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>

> >
> > Regards,
> > Rahul
> >
> >>
> >> On Tue, 15 Jun 2021, Stefano Stabellini wrote:
> >>> On Tue, 15 Jun 2021, Rahul Singh wrote:
> >>>> Hi Stefano
> >>>>
> >>>>> On 15 Jun 2021, at 3:21 am, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> >>>>>
> >>>>> Hi Rahul,
> >>>>>
> >>>>> Unfortunately, after bisecting, I discovered a few more breakages due to
> >>>>> your smmuv1 series (commits e889809b .. 3e6047ddf) on Xilinx ZynqMP. I
> >>>>> attached the DTB as reference. Please note that I made sure to
> >>>>> cherry-pick "xen/arm: smmuv1: Revert associating the group pointer with
> >>>>> the S2CR" during bisection. So the errors are present also on staging.
> >>>>>
> >>>>> The first breakage is an error at boot time in smmu.c#find_smmu_master,
> >>>>> see log1. I think it is due to the lack of ability to parse the new smmu
> >>>>> bindings in the old smmu driver.
> >>>>>
> >>>>> After removing all the "smmus" and "#stream-id-cells" properties in
> >>>>> device tree, I get past the previous error, everything seems to be OK at
> >>>>> early boot, but I actually get SMMU errors as soon as dom0 starting
> >>>>> using devices:
> >>>>>
> >>>>> (XEN) smmu: /smmu@fd800000: Unexpected global fault, this could be serious
> >>>>> (XEN) smmu: /smmu@fd800000: GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000877, GFSYNR2 0x00000000
> >>>>
> >>>> This fault is "Unidentified stream fault" for StreamID "0x877" that
> >>>> means SMMU SMR is not configured for streamID "0x877"
> >>>>
> >>>>
> >>>>> [ 10.419681] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
> >>>>> [ 10.426452] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> >>>>>
> >>>>> Do you think you'll be able to help fix them?
> >>>>>
> >>>>>
> >>>>> You should be able to reproduce the two issues using Xilinx QEMU (but to
> >>>>> be honest I haven't tested it on QEMU yet, I was testing on real
> >>>>> hardware):
> >>>>> - clone and compile xilinx QEMU https://github.com/Xilinx/qemu.git
> >>>>>   ./configure --target-list=aarch64-softmmu
> >>>>>   make
> >>>>> - clone and build git://github.com/Xilinx/qemu-devicetrees.git
> >>>>> - use the attached script to run it
> >>>>> - kernel can be upstream defconfig 5.10
> >>>>>
> >>>>
> >>>> I tried to reproduce the issue on Xilinx QEMU as per the steps shared
> >>>> above but I am not observing any issue on Xilinx QEMU.
> >>>
> >>> I tried on QEMU and it doesn't repro. I cannot explain why it works on
> >>> QEMU and it fails on real hardware.
> >>>
> >>>
> >>>> I also tested and confirmed on QEMU that SMMU is configured correctly
> >>>> for specifically StreamID "0x877" and for other streamIDs.
> >>>>
> >>>> I check the xen.dtb shared by you and found out the there is no
> >>>> "stream-id-cells" property in the master device but the "mmu-masters"
> >>>> property is present in the smmu node. For legacy smmu binding we need
> >>>> both "stream-id-cells" and "mmu-masters".
> >>>> If you need to add the new smmu binding please add the "iommu-cells"
> >>>> property in the smmu node and the "iommus" property in the master device.
> >>>
> >>> In regards to the missing "stream-id-cells" property, I shared the wrong
> >>> dtb before, sorry. I was running a number of tests and I might have
> >>> picked the wrong file. The proper dtb comes with "stream-id-cells" for
> >>> the 0x877 device, see attached.
> >>>
> >>>
> >>>> Can you please share the xen boot logs with me so that I can debug
> >>>> further why the error is observed?
> >>>
> >>> See attached. I did some debugging and discovered that it crashes while
> >>> accessing master->of_node in find_smmu_master. If I revert your series,
> >>> the crash goes away. It is very strange because your patches don't touch
> >>> find_smmu_master or insert_smmu_master directly.
> >>>
> >>> I did a git reset --hard on the commit "xen/arm: smmuv1: Add a stream
> >>> map entry iterator" and it worked, which points to "xen/arm: smmuv1:
> >>> Intelligent SMR allocation" being the problem, even if I have the revert
> >>> cherry-picked on top. Maybe the revert is not reverting enough?
> >>>
> >>> After this test, I switched back to staging and did:
> >>> git revert 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a
> >>> git revert 0435784cc75dcfef3b5f59c29deb1dbb84265ddb
> >>>
> >>> And it worked! So the issue truly is that
> >>> 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a doesn't revert "enough".
> >>> See "full-revert" for the patch reverting the remaining code. That on
> >>> top of staging fixes boot for me.
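For readers following the fault log quoted in the thread (GFSR 0x80000002, GFSYNR1 0x00000877), here is a minimal sketch of how those two register values could be decoded into the "Unidentified stream fault for StreamID 0x877" reading given above. The bit positions are assumptions based on the usual Arm SMMU v1/v2 global fault register layout and should be checked against the architecture specification. The struct and the helper decode_global_fault() are hypothetical names for illustration only and are not part of Xen's smmu.c.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed bit positions (Arm SMMU v1/v2 global fault registers).
 * Verify against the architecture specification before relying on them.
 */
#define GFSR_USF          (UINT32_C(1) << 1)   /* unidentified stream fault */
#define GFSR_MULTI        (UINT32_C(1) << 31)  /* more than one fault recorded */
#define GFSYNR1_SID_MASK  UINT32_C(0xffff)     /* StreamID in the low 16 bits */

/* Hypothetical decoded view of a global fault; not a Xen structure. */
struct gfault {
    bool unidentified_stream;  /* no SMR matched the incoming StreamID */
    bool multiple;             /* further faults occurred before this was read */
    uint16_t stream_id;        /* StreamID of the faulting transaction */
};

/* Hypothetical helper: split raw GFSR/GFSYNR1 values into named fields. */
static struct gfault decode_global_fault(uint32_t gfsr, uint32_t gfsynr1)
{
    struct gfault f;

    f.unidentified_stream = (gfsr & GFSR_USF) != 0;
    f.multiple = (gfsr & GFSR_MULTI) != 0;
    f.stream_id = (uint16_t)(gfsynr1 & GFSYNR1_SID_MASK);
    return f;
}
```

Under these assumptions, calling decode_global_fault(0x80000002, 0x00000877) reports an unidentified stream fault with the multiple-fault flag set and a StreamID of 0x877, which is consistent with the diagnosis in the thread that no SMR was configured for that stream.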