WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] megasas stops I/O when running kernel as dom0 under xen4

On Wed, Aug 24, 2011 at 05:57:06PM +0100, Andrew Cooper wrote:
> On 24/08/11 13:06, Andrew Cooper wrote:
> > On 22/08/11 10:05, Andrew Cooper wrote:
> >> On 19/08/11 19:10, Andreas Olsowski wrote:
> >>> On 19.08.2011 18:49, Andrew Cooper wrote:
> >>>
> >>>> The only change you need to make is in megasas_probe_one() in
> >>>> megaraid_sas_base.c
> >>>>
> >>>> Add a call to pci_enable_msi(pdev) immediately after the current call to
> >>>> pci_set_master(pdev);
> >>>>
> >>>> ~Andrew
> >>>>
> >>> Yep, that works fine. Removed the module option as well.
> >>>
> >>> root@tarballerina:~# cat /proc/interrupts  |grep mega
> >>> 2236:      69010          0          0          0          0          0          0          0  xen-pirq-msi       megasas
> >>>
> >>> The same procedure that previously led to almost instant errors has
> >>> not triggered them again.
> >>>
> >> Good.  This is what we are seeing as well.  I am still awaiting a reply
> >> from LSI on this topic.
> >>
> >> Unfortunately, this does point to a regression in the way Xen deals with
> >> legacy interrupts.
> > Out of interest, on all 3 of your boxes with the megaraid_sas cards,
> > could you gather the io_apic information?
> >
> > It is the 'z' Xen debug key on the serial console (or alternatively put
> > apic_verbosity=debug on the Xen command line and the information gets
> > dumped into dmesg)
> 
> You can ignore this - it is not relevant.
> 
> I have narrowed the problem to a bug in the interrupt migration code.

Goodies!
> 
> The bug occurs when the move pending flag is set, and somehow another
> interrupt comes in on the old pcpu without triggering the move
> completion code.  This leaves the IO_APIC with an ack'd but not EOI'd
> interrupt from the megaraid_sas device.

Ah, so the interrupt is delivered to Dom0 on the old per_cpu
event, which is ignored. Ignored b/c we have rebound the event channel
to the other CPU, right?
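
To make the failure mode described above concrete, here is a toy user-space model of a single level-triggered IO-APIC entry during an interrupt move. It is only a sketch of the scenario as described in this thread, not Xen code: every name in it (toy_irq, toy_begin_move, toy_raise, the complete_move_on_old_cpu flag) is made up for illustration. The one hardware fact it relies on is that a level-triggered IO-APIC entry stays blocked from delivery until it receives an EOI.

```c
#include <stdbool.h>

/* Toy model of one level-triggered IO-APIC redirection entry.
 * All names are hypothetical; this is not Xen code. */
struct toy_ioapic_entry {
    int  dest_cpu;    /* CPU the entry currently targets */
    bool remote_irr;  /* set when an interrupt is accepted, cleared only by EOI */
};

struct toy_irq {
    struct toy_ioapic_entry rte;
    int  old_cpu;       /* CPU the interrupt is moving away from */
    bool move_pending;  /* a move was started but not yet completed */
    int  delivered;     /* interrupts accepted by the entry */
    int  dropped;       /* interrupts blocked because remote_irr was still set */
};

/* Start moving the interrupt to new_cpu and mark the move as pending. */
void toy_begin_move(struct toy_irq *irq, int new_cpu)
{
    irq->old_cpu = irq->rte.dest_cpu;
    irq->rte.dest_cpu = new_cpu;
    irq->move_pending = true;
}

/* The device asserts its line and the interrupt lands on `cpu`.
 * complete_move_on_old_cpu models whether the move-completion code runs
 * when the interrupt arrives on the old CPU (false = the suspected bug).
 * Returns true if the entry accepted the interrupt, false if it was blocked. */
bool toy_raise(struct toy_irq *irq, int cpu, bool complete_move_on_old_cpu)
{
    if (irq->rte.remote_irr) {
        irq->dropped++;          /* entry is still waiting for an EOI */
        return false;
    }

    irq->rte.remote_irr = true;  /* interrupt is ack'd */
    irq->delivered++;

    if (irq->move_pending && cpu == irq->old_cpu && !complete_move_on_old_cpu)
        return true;             /* ignored on the old CPU: no EOI is sent */

    irq->move_pending = false;   /* move completes (or was never pending) */
    irq->rte.remote_irr = false; /* EOI */
    return true;
}
```

In this model, one interrupt arriving on the old CPU without the completion path is enough to leave remote_irr set, after which every later interrupt from the device is dropped. That would match the "I/O stops" symptom, assuming the model reflects what actually happens.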

Is there any way in the hypervisor to turn off the interrupt migration code?
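
For reference, the driver-side workaround quoted earlier in this thread (a pci_enable_msi(pdev) call right after pci_set_master(pdev) in megasas_probe_one()) can be sketched roughly as below. This is a user-space illustration of the call order only, not the real driver: struct pci_dev and both pci_* helpers are stubbed stand-ins for what <linux/pci.h> provides, and toy_megasas_probe_one() is a hypothetical, heavily trimmed placeholder for the real probe function. MSIs are delivered as memory writes rather than through an IO-APIC pin, which is presumably why the change sidesteps the legacy-interrupt problem.

```c
#include <stdbool.h>

/* Stand-ins so the sketch compiles in user space. In the real driver the
 * type and both helpers come from <linux/pci.h>; these bodies are stubs. */
struct pci_dev {
    bool bus_master;   /* stub state: bus mastering enabled */
    bool msi_enabled;  /* stub state: MSI enabled */
};

static void pci_set_master(struct pci_dev *pdev)
{
    pdev->bus_master = true;
}

/* Stub: always succeeds. The stub mirrors the convention of returning 0 on
 * success and a negative value on failure. */
static int pci_enable_msi(struct pci_dev *pdev)
{
    pdev->msi_enabled = true;
    return 0;
}

/* Hypothetical, heavily trimmed stand-in for megasas_probe_one(). */
int toy_megasas_probe_one(struct pci_dev *pdev)
{
    pci_set_master(pdev);

    /* The suggested one-line change: enable MSI right after
     * pci_set_master(). If this fails, the device simply stays on its
     * legacy (INTx) interrupt, so the error is not treated as fatal. */
    if (pci_enable_msi(pdev) != 0) {
        /* nothing to do in this sketch: keep using the legacy interrupt */
    }

    return 0;
}
```

With the change in place, the interrupt should show up as an MSI in /proc/interrupts, as in the xen-pirq-msi line quoted above.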

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
