On Thu, Nov 11, 2010 at 06:13:29PM +0000, Mark Adams wrote:
> On Thu, Nov 11, 2010 at 12:58:09PM -0500, Konrad Rzeszutek Wilk wrote:
> > On Thu, Nov 11, 2010 at 05:38:50PM +0000, Mark Adams wrote:
> > > On Thu, Nov 11, 2010 at 11:53:40AM -0500, Konrad Rzeszutek Wilk wrote:
> > > > On Thu, Nov 11, 2010 at 10:24:17AM +0000, Mark Adams wrote:
> > > > > Hi All,
> > > > >
> > > > > Running xen 4.0.1-rc6, debian squeeze 2.6.32-21.
> > > > >
> > > > > In a voip setup, where I have forwarded the onboard NIC interfaces
> > > > > through to domU using the following grub config:
> > > > >
> > > > > module /vmlinuz-2.6.32-5-xen-amd64 placeholder
> > > > > root=UUID=25c3ac79-6850-498d-afcf-ea42970e94fd ro quiet
> > > > > xen-pciback.permissive xen-pciback.hide=(02:00.0)(03:00.0)
> > > > > pci=resource_alignment=02:00.0;03:00.0
> > > > >
> > > > > I'm having a serious issue where the raid card goes offline after an
> > > > > indefinite period of time. Sometimes it runs fine for a week, other
> > > > > times 1 day before I get "offline device" errors. Rebooting the
> > > > > machine fixes it straight away, and everything is back online.
> > > > >
> > > > > What in the Xen pciback is causing the raid card to go offline? The
> > > > > only devices hidden are the 2 onboard NICs.
> > > >
> > > > You need to give more details. Is the RAID card a 3Ware? An LSI? Do you
> > > > run with an IOMMU? When the RAID card goes offline, do you see a stop of
> > > > IRQs going to the device? Are the IRQs for the RAID card sent to all of
> > > > your CPUs or just a specific one? Are you pinning your guests to
> > > > specific CPUs? Does the issue disappear if you don't pass through the
> > > > NIC interfaces? If so, have you run this setup for "a week" to make sure?
> > >
> > > It is an Areca 1220. I can't see anything when the device goes offline
> > > apart from
> > >
> > > [77324.264270] sd 0:0:0:1: rejecting I/O to offline device
> > > [77334.005854] sd 0:0:0:0: rejecting I/O to offline device
> >
> > That is it? No other details from the driver? Did you poke at the driver
> > (modinfo) to see if there are any options to increase its verbosity?
>
> I can't do anything once it's happened, everything is offline so I have
> no utils...
> >
> > >
> > > Unfortunately nothing gets logged because there is nothing to write to
> > > anymore. I'm not sure how I can see the IRQs otherwise. There is no
> >
> > cat /proc/interrupts
> >
> > > pinning being done at all, and the machine was running for a few months
> > > OK before the pciback was added.
> >
> > Ok, what about your NICs? Are they on-board? Are they sharing the IRQ
> > with the card? You should be able to see this by looking at
> > /proc/interrupts.
> > Which NICs are they? lspci can help you there. As a matter of fact, run
> > lspci -vvv and send that.
>
> It is the onboard NICs, they are Intel 82574L. I can see the arcmsr
> line, but not anything for the NICs (because they are hidden?)
>
> 39: 1126249 0 0 0 0 0 0 0 xen-pirq-ioapic-level arcmsr
>
> Nothing else is on that line (IRQ 39).
>
> see lspci.txt attached.
>
I've just noticed this at the end of xm dmesg:
(XEN) msi.c:715: MSI is already in use on device 02:00.0
(XEN) msi.c:715: MSI is already in use on device 02:00.0
(XEN) msi.c:715: MSI is already in use on device 02:00.0
Is something else trying to use the device being exported? (The NICs are
02:00.0 and 03:00.0.)
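
For reference, the /proc/interrupts checks asked about earlier in the thread
(is the arcmsr IRQ shared with anything, and do its counts stop increasing)
could be scripted roughly as below. This is only a minimal sketch: the
function names are made up for illustration, the CPU header row and the
second snapshot are assumed sample data (only the IRQ 39 row comes from this
thread), and it assumes that handlers sharing an IRQ are listed
comma-separated at the end of the row.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq: (per_cpu_counts, description)}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break  # summary rows such as ERR: carry fewer counters
            counts.append(int(field))
        table[label.strip()] = (counts, " ".join(fields[len(counts):]))
    return table


def find_driver(table, driver):
    """Return (irq, counts, handlers) for every row that mentions the driver."""
    hits = []
    for irq, (counts, desc) in table.items():
        if driver in desc:
            # Assumption: shared IRQs list their handlers comma-separated.
            handlers = [h.strip() for h in desc.split(",")]
            # The first entry still carries the chip/type prefix; drop it.
            handlers[0] = handlers[0].split()[-1]
            hits.append((irq, counts, handlers))
    return hits


def irq_delta(before, after, irq):
    """Per-CPU increase in the interrupt count between two snapshots."""
    return [b - a for a, b in zip(before[irq][0], after[irq][0])]


# Assumed sample: header row invented, IRQ 39 row taken from the thread.
SAMPLE_BEFORE = """\
           CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
 39:    1126249 0 0 0 0 0 0 0  xen-pirq-ioapic-level  arcmsr
"""
# Hypothetical later snapshot, for illustration only.
SAMPLE_AFTER = SAMPLE_BEFORE.replace("1126249", "1126300")

before = parse_interrupts(SAMPLE_BEFORE)
after = parse_interrupts(SAMPLE_AFTER)

for irq, counts, handlers in find_driver(before, "arcmsr"):
    shared = len(handlers) > 1
    print(irq, handlers, "shared" if shared else "not shared")
    print("delta per CPU:", irq_delta(before, after, irq))
```

On the real machine the two snapshots would come from reading
/proc/interrupts twice, some seconds apart. If the delta stays at zero on
every CPU while I/O is outstanding, that would suggest interrupts have
stopped reaching the card.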
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel