WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-users

Re: [Xen-users] Fatal crash: ACPI assigning duplicate physical IRQ's to

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] Fatal crash: ACPI assigning duplicate physical IRQ's to second DomU
From: Hilton Day <xlot@xxxxxxxxxxxxxxxxx>
Date: Tue, 10 Oct 2006 17:53:38 +1000
Delivery-date: Tue, 10 Oct 2006 00:55:07 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <452B15FA.3050409@xxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <452B15FA.3050409@xxxxxxxxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 1.5.0.7 (Windows/20060909)
More info - here is lspci and the contents of /proc/interrupts for each dom.

Dom0
####
[root@dns ~]# cat /proc/interrupts
          CPU0
 1:          8        Phys-irq  i8042
 3:          2        Phys-irq  ehci_hcd:usb3
 4:          0        Phys-irq  ohci_hcd:usb2
 8:          1        Phys-irq  rtc
10:        303        Phys-irq  eth2
11:       4347        Phys-irq  ohci_hcd:usb1, libata
12:        113        Phys-irq  i8042
14:       2112        Phys-irq  ide0
15:        416        Phys-irq  ide1
256:       5804     Dynamic-irq  timer0
257:          0     Dynamic-irq  resched0
258:          0     Dynamic-irq  callfunc0
259:          0     Dynamic-irq  xenbus
260:          0     Dynamic-irq  console
NMI:          0
LOC:          0
ERR:          0
MIS:          0
####

DomU - this has the problem (see IRQ 11)
####
[root@intranet ~]# cat /proc/interrupts
          CPU0
11:       1393        Phys-irq  eth1
256:       3073     Dynamic-irq  timer0
257:          0     Dynamic-irq  resched0
258:          0     Dynamic-irq  callfunc0
259:        104     Dynamic-irq  xenbus
260:        147     Dynamic-irq  xencons
261:       1799     Dynamic-irq  blkif
262:        103     Dynamic-irq  eth0
NMI:          0
LOC:          0
ERR:          0
MIS:          0
####

DomU (this is the other DomU that works fine)
####
[root@gateway ~]# cat /proc/interrupts
          CPU0
 5:         17        Phys-irq  eth1
256:       3111     Dynamic-irq  timer0
257:          0     Dynamic-irq  resched0
258:          0     Dynamic-irq  callfunc0
259:        104     Dynamic-irq  xenbus
260:        170     Dynamic-irq  xencons
261:       1859     Dynamic-irq  blkif
262:        130     Dynamic-irq  eth0
NMI:          0
LOC:          0
ERR:          0
MIS:          0
####

Dom0 PCI (using pciback to reserve 01:09.0 and 01:0a.0. 01:0a.0 is getting the problem IRQ)
####
[root@dns ~]# lspci
00:00.0 Host bridge: nVidia Corporation nForce2 AGP (different version?) (rev c1)
00:00.1 RAM memory: nVidia Corporation nForce2 Memory Controller 1 (rev c1)
00:00.2 RAM memory: nVidia Corporation nForce2 Memory Controller 4 (rev c1)
00:00.3 RAM memory: nVidia Corporation nForce2 Memory Controller 3 (rev c1)
00:00.4 RAM memory: nVidia Corporation nForce2 Memory Controller 2 (rev c1)
00:00.5 RAM memory: nVidia Corporation nForce2 Memory Controller 5 (rev c1)
00:01.0 ISA bridge: nVidia Corporation nForce2 ISA Bridge (rev a3)
00:01.1 SMBus: nVidia Corporation nForce2 SMBus (MCP) (rev a2)
00:02.0 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:02.1 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:02.2 USB Controller: nVidia Corporation nForce2 USB Controller (rev a3)
00:08.0 PCI bridge: nVidia Corporation nForce2 External PCI Bridge (rev a3)
00:09.0 IDE interface: nVidia Corporation nForce2 IDE (rev a2)
00:1e.0 PCI bridge: nVidia Corporation nForce2 AGP (rev c1)
01:08.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
01:09.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)
01:0a.0 Ethernet controller: Lite-On Communications Inc LNE100TX (rev 20)
01:0b.0 RAID bus controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02) 02:00.0 VGA compatible controller: ATI Technologies Inc Radeon R300 ND [Radeon 9700 Pro] 02:00.1 Display controller: ATI Technologies Inc Radeon R300 [Radeon 9700 Pro] (Secondary)
####


Hilton Day wrote:
Hi,

I've got a problem with ACPI assigning duplicate physical IRQ's to one
of my DomU's that I'm passing a PCI NIC to.  Can anyone shed some light
into ways I can avoid this problem with IRQ allocation?  I can see the
irq allocations using /proc/interrupts and see the conflict.

In my Dom0 I have 3 network cards.  eth0 and eth1 are identical
tulip-based 100MB cards, and eth2 is a realtek gigabit card that I'm
using as the xen-bridge. I have this problem with a variety of different
kernels - currently running kernel-xen-2.6.18-1 (fedora core 6
development tree) on all hosts, with xen-3.0.2.

Pass-through always works just fine for one of my DomU's, and ACPI
allocates an unused physical IRQ with no problems.  However, in a second
DomU, it consistently allocates the same IRQ as is used by my onboard
SATA controller (libata).

When the second DomU is running, I get a fatal crash that also destroys
my RAID volume info, and severely damages the filesystem.  I've had to
manually rebuild the raid each time this happens, so that I can try a
new alternative solution.  The crash typically happens within a few
minutes of booting.

After reading the archives of this list, as well as other lists, I've
tried putting "noirqdebug" as a kernel parameter in both the Dom0 and
DomU, and also made use of "noapic" and "acpi=off", as well as disabling
ACPI in my motherboard's bios (system is an athlonxp running on an
nforce2 motherboard with 2 gigs of ram).  None of them resolves the
conflict - it appears to be a bug that affects pass-through of PCI
devices and IRQ allocation?

I've also tried a variety of other ethernet devices (the forcedeth
driver for nforce2 onboard nic, and also natsemi driver for a Netgear
FA311) to pass through to the second DomU, with the same result.  Moving
the PCI card to a different PCI bus address/slot doesn't resolve the
problem either.

I managed to grab Dmesg outputs from dom0 and the problem DomU last time
it crashed -

The message I'm getting to console in the domU is:
####
irq 11: nobody cared (try booting with the "irqpoll" option)
[<c040569e>] dump_trace+0x69/0x1af
[<c04057fc>] show_trace_log_lvl+0x18/0x2c
[<c0405d9c>] show_trace+0xf/0x11
[<c0405dcb>] dump_stack+0x15/0x17
[<c044636e>] __report_bad_irq+0x36/0x7d
[<c044655b>] note_interrupt+0x1a6/0x1e3
[<c0445bda>] __do_IRQ+0xba/0xf2
[<c0406c2c>] do_IRQ+0x9e/0xbc
=======================
handlers:
[<d10636e8>] (tulip_interrupt+0x0/0xdb8 [tulip])
Disabling IRQ #11
end_request: I/O error, dev xvda, sector 42806344
Buffer I/O error on device xvda3, logical block 5061623
lost page write due to I/O error on xvda3
####

The dmesg output from the Dom0 following booting of the second DomU is:

####
PCI: Enabling device 0000:01:0a.0 (0000 -> 0003)
ACPI: PCI Interrupt 0000:01:0a.0[A] -> Link [LNK3] -> GSI 11 (level,
low) -> IRQ 11
ADDRCONF(NETDEV_CHANGE): vif4.0: link becomes ready
xenbr0: port 4(vif4.0) entering learning state
xenbr0: topology change detected, propagating
xenbr0: port 4(vif4.0) entering forwarding state
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata1.00: (BMDMA stat 0x64)
ata1.00: tag 0 cmd 0x25 Emask 0x4 stat 0x40 err 0x0 (timeout)
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: (BMDMA stat 0x64)
ata2.00: tag 0 cmd 0xca Emask 0x4 stat 0x40 err 0x0 (timeout)
ata1: soft resetting port
ata2: soft resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2.00: revalidation failed (errno=-5)
ata2: failed to recover some devices, retrying in 5 secs
ata1: hard resetting port
ata2: hard resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1: failed to recover some devices, retrying in 5 secs
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2.00: revalidation failed (errno=-5)
ata2: failed to recover some devices, retrying in 5 secs
ata1: hard resetting port
ata2: hard resetting port
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1.00: revalidation failed (errno=-5)
ata1.00: disabled
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2.00: revalidation failed (errno=-5)
ata2.00: disabled
ata1: EH complete
ata2: EH complete
sd 0:0:0:0: SCSI error: return code = 0x00040000
end_request: I/O error, dev sda, sector 58152585
raid5:md3: read error not correctable (sector 46682112 on sda5).
raid5: Disk failure on sda5, disabling device. Operation continuing on 1
devices
raid5:md3: read error not correctable (sector 46682120 on sda5).
####


Please, any help in resolving this appreciated - I'd like to get this
host up and running!

Thanks,

Hilton.


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users

<Prev in Thread] Current Thread [Next in Thread>