On 06/06/11 16:21, Keir Fraser wrote:
> On 06/06/2011 15:32, "Andrew Cooper" <andrew.cooper3@xxxxxxxxxx> wrote:
>
>> I am attempting to fix the kexec interactions with x2apic and iommu
>> functionality. Part of this involves ensuring that all IOMMU
>> functionality is disabled, as the kdump kernels are not happy at having
>> their interrupts remapped without their knowledge.
>>
>> I have introduced iommu_disable_x2apic_IR() onto the kexec path, but it
>> does not seem to actually disable interrupt remapping on Intel boxes
>> (Specifically the two Intel Nehalem boxes I am testing on).
>>
>> Specifying iommu=no-intremap on the commandline causes everything to
>> work correctly, but leaving it out causes the kdump kernel to hang and
>> eventually reboot, as can be seen on the attached serial log.
>>
>> The lines starting DBG: are extra debugging I have put in which shows
>> that the disable_IR() function is being called and writing to the registers.
> Should have attached your patch as well. No one else can know with certainty
> where you put your debugging, and no one else is going to want to help debug
> your code if they can't even see it. :-)
>
> Also a good idea to Cc a likely person who can help (i.e., someone who wrote
> the code that you are querying). 'hg annotate' is useful for this -- in this
> case I am adding Weidong Han to the cc list.
>
> On the bright side, this must have been got working for S3 suspend/resume to
> work properly (indeed that's what the disable code was originally added
> for). So it can't be an insurmountable problem.
>
> -- Keir
>
>> This problem occurs with the XenServer version of 4.1.0 as well as on
>> xen-unstable at the moment.
>>
>> Is there any hardware state which is not taken down by the disable
>> function, any subtle interactions which I have not taken account of? I
>> have looked through the source and nothing pops out, but I am out of ideas.
>>
>> Thanks in advance,
>
Attached are the two relevant patches, and two which I don't think are
relevant but might be if I am wrong. crash_shutdown was an attempt to
make an iommu_ops which shut down all iommu functionality without saving
state. debug-wip shows where I have put in debug statements.
kexec-fix-x2apic and apic-record-boot-mode are also in the source, but I
believe them to be unrelated to the current problem.

I have done some further debugging on the assumption that the order in
which interrupt remapping is shut down, relative to the lapics and
ioapics, matters, but disable_qinval causes a panic (qinval.c:222 -
"queue invalidate wait descriptor was not executed\n") if it is run
before both the lapics and ioapics are shut down.
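
To make the ordering constraint concrete, here is a toy model in C. It is
only a sketch of the observation above, not code from the attached patches:
every name in it (crash_state, model_disable_qinval, model_crash_shutdown)
is a placeholder invented for illustration and does not correspond to a
real Xen symbol, and the "panic" is modelled as a false return value.

```c
/* Toy model only: all names are hypothetical, none are real Xen symbols. */
#include <stdbool.h>

struct crash_state {
    bool lapics_down;   /* local APICs have been shut down */
    bool ioapics_down;  /* IO-APICs have been shut down */
    bool qinval_down;   /* queued invalidation has been disabled */
};

/* True if, per the observation above, disabling queued invalidation
 * should not hit the "wait descriptor was not executed" panic. */
static bool qinval_disable_is_safe(const struct crash_state *s)
{
    return s->lapics_down && s->ioapics_down;
}

/* Try to disable queued invalidation. A false return stands in for the
 * panic seen when this runs before both APIC types are shut down. */
static bool model_disable_qinval(struct crash_state *s)
{
    if (!qinval_disable_is_safe(s))
        return false;
    s->qinval_down = true;
    return true;
}

/* The ordering implied by the observation: lapics and ioapics first,
 * queued invalidation last. */
static bool model_crash_shutdown(struct crash_state *s)
{
    s->lapics_down = true;
    s->ioapics_down = true;
    return model_disable_qinval(s);
}
```

Calling model_disable_qinval() on a state where either APIC flag is still
false returns false, while model_crash_shutdown() on a fresh state returns
true. This only encodes the ordering that avoids the panic; it says nothing
about whether the kdump kernel then boots.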
--
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com
crash_shutdown_iommu_ops.patch
Description: Text Data
debug-wip
Description: Text document
apic-record-boot-mode.patch
Description: Text Data
kexec-fix-x2apic.patch
Description: Text Data