
Re: i915 dma faults on Xen


  • To: Jason Andryuk <jandryuk@xxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 22 Feb 2021 11:18:07 +0100
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, <intel-gfx@xxxxxxxxxxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, eric chanudet <eric.chanudet@xxxxxxxxx>
  • Delivery-date: Mon, 22 Feb 2021 10:18:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Feb 19, 2021 at 12:30:23PM -0500, Jason Andryuk wrote:
> On Wed, Oct 21, 2020 at 9:59 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >
> > On 21.10.2020 15:36, Jason Andryuk wrote:
> > > On Wed, Oct 21, 2020 at 8:53 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> > >>
> > >> On 21.10.2020 14:45, Jason Andryuk wrote:
> > >>> On Wed, Oct 21, 2020 at 5:58 AM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
> > >>>> Hm, it's hard to tell what's going on. In my limited experience with
> > >>>> IOMMU faults on broken systems, there's a small range that initially
> > >>>> triggers those, and then the device goes wonky and starts accessing a
> > >>>> whole load of invalid addresses.
> > >>>>
> > >>>> You could try adding those manually using the rmrr Xen command line
> > >>>> option [0], maybe you can figure out which range(s) are missing?
> > >>>
> > >>> They seem to change, so it's hard to know.  Would there be harm in
> > >>> adding one to cover the end of RAM ( 0x04,7c80,0000 ) to (
> > >>> 0xff,ffff,ffff )?  Maybe that would just quiet the pointless faults
> > >>> while leaving the IOMMU enabled?
> > >>
> > >> While they may quieten the faults, I don't think those faults are
> > >> pointless. They indicate some problem with the software (less
> > >> likely the hardware, possibly the firmware) that you're using.
> > >> Also there's the question of what the overall behavior is going
> > >> to be when devices are permitted to access unpopulated address
> > >> ranges. I assume you did check already that no devices have their
> > >> BARs placed in that range?
> > >
> > > Isn't no-igfx already letting them try to read those unpopulated addresses?
> >
> > Yes, and it is for the reason that the documentation for the
> > option says "If specifying `no-igfx` fixes anything, please
> > report the problem." I infer from it in particular that one
> > better wouldn't use it for non-development purposes of whatever
> > kind.
> 
> I stopped seeing these DMA faults, but I didn't know what made them go
> away.  Then when working with an older 5.4.64 kernel, I saw them
> again.  Eric bisected down to the 5.4.y version of mainline linux
> commit:
> 
> commit 8195400f7ea95399f721ad21f4d663a62c65036f
> Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
> Date:   Mon Oct 19 11:15:23 2020 +0100
> 
>     drm/i915: Force VT'd workarounds when running as a guest OS
> 
>     If i915.ko is being used as a passthrough device, it does not know if
>     the host is using intel_iommu. Mixing the iommu and gfx causes a few
>     issues (such as scanout overfetch) which we need to workaround inside
>     the driver, so if we detect we are running under a hypervisor, also
>     assume the device access is being virtualised.

So the commit above fixes the DMA faults seen on Linux when using an
i915 gfx card?

Thanks for digging into this.

Roger.



 

