
Re: smmuv1 breakage


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>
  • From: Rahul Singh <Rahul.Singh@xxxxxxx>
  • Date: Wed, 23 Jun 2021 08:09:13 +0000
  • Cc: "edgar.iglesias@xxxxxxxxxx" <edgar.iglesias@xxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Julien Grall <julien@xxxxxxx>, "fnuv@xxxxxxxxxx" <fnuv@xxxxxxxxxx>
  • Delivery-date: Wed, 23 Jun 2021 08:09:52 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-topic: smmuv1 breakage

Hi Stefano,

> On 22 Jun 2021, at 10:06 pm, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
> 
> Hi Rahul,
> 
> Do you have an opinion on how we should move forward on this?
> 
> Do you think it is OK to go for a full revert of "xen/arm: smmuv1:
> Intelligent SMR allocation" or do you think it is best to go with an
> alternative fix? If so, do you have something in mind?
> 

Sorry for the late reply; I was working on another high-priority task.
I will work on this and try to fix the issue. I will update you within 2-3
days.

Regards,
Rahul

> 
> 
> On Tue, 15 Jun 2021, Stefano Stabellini wrote:
>> On Tue, 15 Jun 2021, Rahul Singh wrote:
>>> Hi Stefano
>>> 
>>>> On 15 Jun 2021, at 3:21 am, Stefano Stabellini <sstabellini@xxxxxxxxxx> wrote:
>>>> 
>>>> Hi Rahul,
>>>> 
>>>> Unfortunately, after bisecting, I discovered a few more breakages due to
>>>> your smmuv1 series (commits e889809b .. 3e6047ddf) on Xilinx ZynqMP. I
>>>> attached the DTB as reference. Please note that I made sure to
>>>> cherry-pick "xen/arm: smmuv1: Revert associating the group pointer with
>>>> the S2CR" during bisection. So the errors are present also on staging.
>>>> 
>>>> The first breakage is an error at boot time in smmu.c#find_smmu_master,
>>>> see log1. I think it is due to the lack of ability to parse the new smmu
>>>> bindings in the old smmu driver.
>>>> 
>>>> After removing all the "smmus" and "#stream-id-cells" properties in
>>>> device tree, I get past the previous error, everything seems to be OK at
>>>> early boot, but I actually get SMMU errors as soon as dom0 starts
>>>> using devices:
>>>> 
>>>> (XEN) smmu: /smmu@fd800000: Unexpected global fault, this could be serious
>>>> (XEN) smmu: /smmu@fd800000:     GFSR 0x80000002, GFSYNR0 0x00000000, GFSYNR1 0x00000877, GFSYNR2 0x00000000
>>> 
>>> This fault is an "Unidentified stream fault" for StreamID 0x877, which
>>> means that no SMMU SMR is configured for StreamID 0x877.
>>> 
>>> 
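For reference, the reading of the fault registers above can be sketched as follows. This is only an illustration: the bit positions are assumptions taken from the commonly documented Arm SMMU global fault status register layout, the StreamID mask is assumed to cover the low 16 bits of GFSYNR1, and `decode_global_fault` is a hypothetical helper, not a function from the Xen driver.

```python
# Assumed GFSR bit layout (Arm SMMU global fault status register).
GFSR_BITS = {
    0: "ICF (invalid context fault)",
    1: "USF (unidentified stream fault)",
    2: "SMCF (stream match conflict fault)",
    3: "UCBF (unimplemented context bank fault)",
    4: "UCIF (unimplemented context interrupt fault)",
    5: "CAF (configuration access fault)",
    6: "EF (external fault)",
    7: "PF (permission fault)",
    31: "MULTI (multiple faults recorded)",
}

# Assumption: the faulting StreamID is reported in the low bits of GFSYNR1.
STREAMID_MASK = 0xFFFF


def decode_global_fault(gfsr, gfsynr1):
    """Return the names of the fault bits set in GFSR and the faulting StreamID."""
    faults = [name for bit, name in sorted(GFSR_BITS.items()) if gfsr & (1 << bit)]
    return faults, gfsynr1 & STREAMID_MASK


# Register values copied from the log line quoted above.
faults, stream_id = decode_global_fault(0x80000002, 0x00000877)
print(faults, hex(stream_id))
```

With the values from the log, the set bits are USF and MULTI, and the StreamID field reads 0x877, which agrees with the interpretation given in the reply.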
>>>> [   10.419681] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK
>>>> [   10.426452] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>>>> 
>>>> Do you think you'll be able to help fix them?
>>>> 
>>>> 
>>>> You should be able to reproduce the two issues using Xilinx QEMU (but to
>>>> be honest I haven't tested it on QEMU yet, I was testing on real
>>>> hardware):
>>>> - clone and compile xilinx QEMU https://github.com/Xilinx/qemu.git
>>>> ./configure  --target-list=aarch64-softmmu
>>>> make
>>>> - clone and build git://github.com/Xilinx/qemu-devicetrees.git
>>>> - use the attached script to run it
>>>>   - kernel can be upstream defconfig 5.10
>>>> 
>>> 
>>> I tried to reproduce the issue on Xilinx QEMU as per the steps shared above 
>>> but I am not observing any issue on Xilinx QEMU.
>> 
>> I tried on QEMU and it doesn't repro. I cannot explain why it works on
>> QEMU and it fails on real hardware.
>> 
>> 
>>> I also tested and confirmed on QEMU that the SMMU is configured correctly
>>> for StreamID 0x877 specifically, and for the other StreamIDs.
>>> 
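What "configured correctly for a StreamID" means can be sketched with a toy model of stream matching. The assumption here is the usual SMMU rule that an incoming StreamID hits a Stream Match Register (SMR) when the SMR is valid and the ID bits agree wherever the mask bit is clear. The `Smr` class and both helpers are hypothetical names for illustration, not the driver's data structures.

```python
from dataclasses import dataclass


@dataclass
class Smr:
    valid: bool
    sid: int   # StreamID programmed into the SMR
    mask: int  # bits set here are ignored during comparison


def smr_matches(smr, stream_id):
    """True if the SMR is valid and matches stream_id on all unmasked bits."""
    return smr.valid and ((stream_id ^ smr.sid) & ~smr.mask) == 0


def find_matching_smrs(smrs, stream_id):
    """Indices of all SMRs that match; an empty list models an unidentified stream."""
    return [i for i, smr in enumerate(smrs) if smr_matches(smr, stream_id)]


# Hypothetical SMR table: none of these entries covers StreamID 0x877.
smrs = [
    Smr(valid=True, sid=0x874, mask=0x0),   # exact match on 0x874 only
    Smr(valid=True, sid=0x870, mask=0x3),   # covers 0x870..0x873
    Smr(valid=False, sid=0x877, mask=0x0),  # right ID, but entry not valid
]
print(find_matching_smrs(smrs, 0x877))
```

In this model an empty result corresponds to the "Unidentified stream fault" case, while widening the second entry's mask to 0x7 would make it cover 0x877.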
>>> I checked the xen.dtb you shared and found that there is no
>>> "#stream-id-cells" property in the master device, but the "mmu-masters"
>>> property is present in the smmu node. The legacy smmu binding needs both
>>> "#stream-id-cells" and "mmu-masters". If you want to use the new smmu
>>> binding instead, please add the "#iommu-cells" property in the smmu node
>>> and the "iommus" property in the master device.
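The pairing rule described above can be written down as a small consistency check. This is a sketch over a toy model in which each device tree node is a plain dict of property names (written with the `#` prefix the bindings use); `check_smmu_bindings` is a hypothetical helper and does not parse a real DTB.

```python
def check_smmu_bindings(smmu_node, master_node):
    """Return a list of problems found in a toy smmu/master node pair."""
    problems = []

    legacy_smmu = "mmu-masters" in smmu_node
    legacy_master = "#stream-id-cells" in master_node
    generic_smmu = "#iommu-cells" in smmu_node
    generic_master = "iommus" in master_node

    # Legacy binding: both halves must be present together.
    if legacy_smmu != legacy_master:
        problems.append("legacy binding incomplete: need both "
                        "'mmu-masters' (smmu) and '#stream-id-cells' (master)")
    # Generic binding: both halves must be present together.
    if generic_smmu != generic_master:
        problems.append("generic binding incomplete: need both "
                        "'#iommu-cells' (smmu) and 'iommus' (master)")
    if not (legacy_smmu or legacy_master or generic_smmu or generic_master):
        problems.append("no smmu binding found")
    return problems


# Shape of the first DTB discussed above: 'mmu-masters' without '#stream-id-cells'.
# The property values are placeholders for illustration only.
smmu = {"compatible": "arm,mmu-500", "mmu-masters": "placeholder"}
master = {"compatible": "cdns,zynqmp-gem"}
print(check_smmu_bindings(smmu, master))
```

For the node shapes modelled here the check reports the incomplete legacy binding; adding `#stream-id-cells` to the master dict clears it.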
>> 
>> Regarding the missing "#stream-id-cells" property, I shared the wrong
>> dtb before, sorry. I was running a number of tests and I might have
>> picked the wrong file. The proper dtb comes with "#stream-id-cells" for
>> the 0x877 device, see attached.
>> 
>> 
>> 
>>> Can you please share the xen boot logs with me so that I can debug further 
>>> why the error is observed? 
>> 
>> See attached. I did some debugging and discovered that it crashes while
>> accessing master->of_node in find_smmu_master. If I revert your series,
>> the crash goes away. It is very strange because your patches don't touch
>> find_smmu_master or insert_smmu_master directly.
>> 
>> I did a git reset --hard on the commit "xen/arm: smmuv1: Add a stream
>> map entry iterator" and it worked, which points to "xen/arm: smmuv1:
>> Intelligent SMR allocation" being the problem, even if I have the revert
>> cherry-picked on top. Maybe the revert is not reverting enough?
>> 
>> After this test, I switched back to staging and did:
>> git revert 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a
>> git revert 0435784cc75dcfef3b5f59c29deb1dbb84265ddb
>> 
>> And it worked! So the issue truly is that
>> 9f6cd4983715cb31f0ea540e6bbb63f799a35d8a doesn't revert "enough".
>> See "full-revert" for the patch reverting the remaining code. That on
>> top of staging fixes boot for me.


 

