
Re: [Xen-devel] S3 crash with VTD Queue Invalidation enabled



>>> On 06.06.13 at 01:53, Ben Guthro <ben@xxxxxxxxxx> wrote:
>> Early in the boot process, I see queue_invalidate_wait() called for
>> DRHD unit 0, and 1
>> (unit 0 is wired up to the IGD, unit 1 is everything else)
>>
>> Up until i915 does the following, I see that unit being flushed with
>> queue_invalidate_wait() :
>>
>> [    0.704537] ENERGY_PERF_BIAS: Set to 'normal', was 'performance'
>> [    0.704537] ENERGY_PERF_BIAS: View and update with x86_energy_p
>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>> (XEN) XXX queue_invalidate_wait:282 CPU0 DRHD0 ret=0
>> [    1.983028] [drm] GMBUS [i915 gmbus dpb] timed out, falling back to
>> bit banging on pin 5
>> [    2.253551] fbcon: inteldrmfb (fb0) is primary device
>> [    3.111838] Console: switching to colour frame buffer device 170x48
>> [    3.171631] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
>> [    3.171634] i915 0000:00:02.0: registered panic notifier
>> [    3.173339] acpi device:00: registered as cooling_device1
>> [    3.173401] ACPI: Video Device [VID] (multi-head: yes  rom: no  post: no)
>> [    3.173962] input: Video Bus as
>> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input4
>> [    3.174232] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on 
>> minor 0
>> [    3.174258] ahci 0000:00:1f.2: version 3.0
>> [    3.174270] xen: registering gsi 19 triggering 0 polarity 1
>> [    3.174274] Already setup the GSI :19
>>
>>
>> After that - the unit never seems to be flushed.

With queue_invalidate_wait() having a single caller -
invalidate_sync() - and with invalidate_sync() being called from
all interrupt setup paths (IO-APIC as well as MSI), it would be quite
odd for that to be the case. At least upon network driver load or
interface-up, this should be getting called.
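
For reference, the path in question boils down to roughly the following
(a simplified sketch modeled on xen/drivers/passthrough/vtd/qinval.c of
that time, not verbatim code; parameter details differ between Xen
versions):

/*
 * invalidate_sync() is the sole caller of queue_invalidate_wait(),
 * which queues a wait descriptor and spins until the IOMMU writes
 * back the status word (or the timeout panic fires).
 */
static int invalidate_sync(struct iommu *iommu)
{
    struct qi_ctrl *qi_ctrl = iommu_qi_ctrl(iommu);

    if ( qi_ctrl->qinval_maddr )  /* queued invalidation active on this DRHD */
        return queue_invalidate_wait(iommu,
                                     0 /* iflag */, 1 /* sw */, 1 /* fn */);

    return 0;
}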

>> ...until we enter into the S3 hypercall, which loops over all DRHD
>> units, and explicitly flushes all of them via iommu_flush_all()
>>
>> It is at that point that it hangs up when talking to the device that
>> the IGD is plumbed up to.
>>
>>
>> Does this point to something in the i915 driver doing something that
>> is incompatible with Xen?
> 
> I actually separated it from the S3 hypercall, adding a new debug key
> 'F' - to just call iommu_flush_all()
> I can crash it on demand with this.
> 
> Booting with "i915.modeset=0 single" (to prevent both KMS, and Xorg) -
> it does not occur.
> So, that pretty much narrows it down to the IGD, in my mind.
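
(For reference, such a debug-key hook would look roughly like the sketch
below - the handler name is made up, and the struct-based keyhandler
interface shown is the one Xen used at the time; it assumes that
iommu_flush_all(), which lives in the VT-d code, is visible at this
point:)

#include <xen/init.h>
#include <xen/lib.h>
#include <xen/keyhandler.h>

/* Illustrative only - not the actual patch referred to above. */
static void iommu_flush_all_keyhandler_fn(unsigned char key)
{
    printk("'%c' pressed -> flushing all IOMMUs\n", key);
    iommu_flush_all();
}

static struct keyhandler iommu_flush_all_keyhandler = {
    .diagnostic = 1,
    .u.fn = iommu_flush_all_keyhandler_fn,
    .desc = "flush all IOMMUs"
};

static void __init setup_iommu_flush_all_key(void)
{
    register_keyhandler('F', &iommu_flush_all_keyhandler);
}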

Which reminds me of a change I did several weeks back to our kernel,
but which isn't as easily done with pv-ops: there are a number of
places in the AGP and DRM code that are conditional upon
CONFIG_INTEL_IOMMU and consult intel_iommu_gfx_mapped. As you certainly
know, Linux when running on Xen doesn't see any IOMMU, and hence
whether the config option is enabled or disabled is completely
unrelated to whether the driver actually runs on top of an enabled
IOMMU. Similarly, the setting of intel_iommu_gfx_mapped cannot possibly
happen when running on top of Xen, as it sits in code that never gets
used in that case.
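
In rough code, the pattern being referred to amounts to something like
this (paraphrased, not a verbatim kernel excerpt; the function name is
made up):

#include <linux/types.h>

#ifdef CONFIG_INTEL_IOMMU
extern int intel_iommu_gfx_mapped;   /* set only by the native VT-d driver */
#endif

static bool gfx_behind_enabled_iommu(void)
{
#ifdef CONFIG_INTEL_IOMMU
    /* Never becomes non-zero under Xen: the code setting it never runs. */
    return intel_iommu_gfx_mapped;
#else
    /* With the option off, "no IOMMU" is assumed even if Xen has one enabled. */
    return false;
#endif
}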

A possibly simple, but rather hacky solution might be to always set
that variable when running on Xen. But that wouldn't cover the case
of a kernel built without CONFIG_INTEL_IOMMU, where the driver might
still run with an IOMMU enabled underneath.
(In our case I can simply always #define intel_iommu_gfx_mapped
to 1, with the INTEL_IOMMU option getting forcibly disabled for the
Xen kernel flavors anyway. Whether that's entirely correct when
not running on an enabled IOMMU I can't tell yet, and don't know
whom to ask.)
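
In code, the two variants amount to roughly the following (hook
placement and names are hypothetical - any early Xen-specific init
would do):

#include <linux/init.h>
#include <xen/xen.h>                 /* xen_domain() */

#ifdef CONFIG_INTEL_IOMMU
extern int intel_iommu_gfx_mapped;   /* normally set only by intel-iommu.c */
#endif

/* Variant 1 (pv-ops): always flag the GFX device as IOMMU-mapped when
 * running under Xen.  Still leaves the CONFIG_INTEL_IOMMU=n case
 * uncovered. */
static void __init xen_mark_igd_iommu_mapped(void)
{
#ifdef CONFIG_INTEL_IOMMU
    if (xen_domain())
        intel_iommu_gfx_mapped = 1;
#endif
}

/* Variant 2 (the non-pv-ops kernel mentioned above), with INTEL_IOMMU
 * forcibly disabled for the Xen kernel flavors: */
#define intel_iommu_gfx_mapped 1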

And that wouldn't cover the IGD getting passed through to a DomU
at all - obviously Xen's ability to properly drive all IOMMU operations
(including qinval) must not depend on the owning guest's driver code.

I have to admit though that it entirely escapes me why a graphics
driver needs to peek into IOMMU code/state in the first place. This
very much smells of bad design.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
