On Mon, Jul 20, 2009 at 6:13 PM, Mike Lovell<mike@xxxxxxxxxxxx> wrote:
> Chris wrote:
>>
>> Quoting Mike Lovell <mike@xxxxxxxxxxxx>:
>>
>>> Chris wrote:
>>>>
>>>> Quoting Mike Lovell <mike@xxxxxxxxxxxx>:
>>>>
>>>>> On 7/18/2009 6:57 PM, Chris wrote:
>>>>>>
>>>>>> Hello.
>>>>>>
>>>>>> I am having a peculiar problem. I am running a dual amd64 system on
>>>>>> a Tyan Tomcat h1000s S3950 motherboard with 4 GB of RAM and two SATA
>>>>>> 3.0 Gb/s Seagate drives (in a RAID 1 configuration using LVM2). I am
>>>>>> using Gentoo Linux, current as of today.
>>>>>>
>>>>>> When I boot the system, everything looks good until it starts the
>>>>>> drives, then the system chokes and spits out a bunch of errors like
>>>>>> these:
>>>>>> --------------------
>>>>>>
>>>>>> ACPI: PCI Interrupt 0000:01:0e.0[A] -> GSI 11 (level, low) -> IRQ 11
>>>>>> ata1: SATA max UDMA/133 cmd 0xffffc2000002c000 ctl
>>>>>> 0xffffc2000002c020 bmdma 0xffffc2000002c031
>>>>>> ata2: SATA max UDMA/133 cmd 0xffffc2000002c100 ctl
>>>>>> 0xffffc2000002c120 bmdma 0xffffc2000002c131
>>>>>> ata3: SATA max UDMA/133 cmd 0xffffc2000002c200 ctl
>>>>>> 0xffffc2000002c220 bmdma 0xffffc2000002c231
>>>>>> ata4: SATA max UDMA/133 cmd 0xffffc2000002c300 ctl
>>>>>> 0xffffc2000002c320 bmdma 0xffffc2000002c331
>>>>>> scsi0: sata_svw
>>>>>> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>>>> ata1.00: ATA-7: ST3250410AS, 3.AAC, max UDMA/133
>>>>>> ata1.00: 488397168 sectors, multi 16: LBA48 NCQ (depth 0/32)
>>>>>> ata1.00: qc timeout (cmd0xef)
>>>>>> ata1.00: failed to set xfermode (err_mask=0x4)
>>>>>> ata1: failed to recover some devices retrying in 5 secs
>>>>>> (starts over with SATA link up 1.5 gbps...)
>>>>>> ------------------------------------------
>>>>>> Eventually the system gives up (after struggling for like 10
>>>>>> minutes) and locks up hard.
>>>>>>
>>>>>> I found that if I disable APIC in the BIOS, it will boot -- but as
>>>>>> soon as I try to start the network, the logs fill with more of the
>>>>>> same junk, and the system locks up hard.
>>>>>>
>>>>>> I installed a regular Gentoo kernel, and used the exact same config
>>>>>> file -- the system boots just fine. No problems whatsoever. So, this
>>>>>> is definitely a Xen thing...
>>>>>>
>>>>>> Anything I can do to fix this?
>>>>>>
>>>>>> btw: using kernel version 2.6.21-xen -- the most recent kernel in
>>>>>> portage -- and xen 3.4.0
>>>>>
>>>>> Do you get these same errors on a non-xenified kernel? Also, do the
>>>>> errors always mention ata1 or scsi0? If so, the first disk on the
>>>>> controller is probably bad and should be replaced. These errors are
>>>>> usually the driver detecting a problem disk and trying to handle the
>>>>> errors. Hope that answers your question.
>>>>>
>>>>> mike
>>>>
>>>> As I stated, I have installed a regular Gentoo kernel with the same
>>>> exact .config file, and I do NOT have these problems. Only with Xen.
>>>> The controller is fine... Gotta be a Xen issue. It stinks of an IRQ or
>>>> some similar conflict to me...
>>>>
>>>> Any other ideas?
>>>>
>>>> Thanks.
>>>
>>> My bad for not completely reading your email. Sry.
>>>
>>> Was the regular kernel you used also a 2.6.21 kernel? I am assuming so
>>> since you said you used the exact same .config file and that is always
>>> changing between kernel versions. I have never used a controller that
>>> uses the sata_svw driver. One other option would be to use a newer
>>> kernel. There may not be a newer version in portage, but there are
>>> people that have made newer working tarballs.
>>>
>>> http://lists.xensource.com/archives/html/xen-users/2009-06/msg00200.html
>>> http://code.google.com/p/gentoo-xen-kernel/
>>>
>>> Hunting around the mailing list archives should turn up some additional
>>> resources. Hopefully these sites or the archives can help you out.
>>>
>>> mike
>>
>> Bad news -- I added the overlay for the 2.6.30-r2 Xen-patched kernel, used
>> the same .config and recompiled... it won't boot. Goes right to a black
>> screen, then reboots. Tried remaking my initrd several times a few
>> different ways, just to see (a total shot in the dark) -- but no dice.

Try the patches for 2.6.29 at
http://code.google.com/p/gentoo-xen-kernel/downloads/list. They will
apply to 2.6.29.6. The .30 release is known to be buggy; in fact, I've
just changed the comment to suggest using .29 instead, because so many
people reported problems and I've not been able to fix them all.

Andy

>>
>> Also, I downloaded a fresh, vanilla 2.6.21 kernel from kernel.org, copied
>> over the same .config file I've been using -- and the system boots without
>> complaint. The problem is definitely stemming from Xen. It either doesn't
>> like the Broadcom ServerWorks sata_svw SATA chipset, or I'm having an IRQ
>> conflict (or both). Since it will boot with APIC turned off in the BIOS,
>> I'm thinking the latter. Starting up the network (with APIC off) causes the
>> conflict. Booting with the APIC turned on in the BIOS, the conflict just
>> happens sooner... (well, as good a theory as any other...)
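
One way to test the shared-IRQ theory is to check /proc/interrupts for
lines that list more than one handler, for example under the vanilla
kernel that does boot. Below is a minimal sketch in Python, assuming the
usual /proc/interrupts layout (counters, controller type, then a
comma-separated handler list). The function name shared_irqs and the
SAMPLE text, including the handler names in it, are made up for
illustration and are not output from this machine. IRQ routing under Xen
may also differ from the vanilla kernel, so treat the result as a hint
only.

```python
import os
import re

# Made-up sample in the usual /proc/interrupts layout (not real output).
SAMPLE = """\
           CPU0       CPU1
  0:      51234          0   IO-APIC-edge      timer
 11:       9876        123   IO-APIC-fasteoi   sata_svw, eth0
 14:          0          0   IO-APIC-edge      ide0
NMI:          0          0
"""


def shared_irqs(interrupts_text):
    """Return {irq: [handler, ...]} for IRQ lines listing more than one handler."""
    shared = {}
    for line in interrupts_text.splitlines():
        # Only numbered IRQ lines matter; skip the CPU header and NMI/LOC/ERR rows.
        match = re.match(r"\s*(\d+):\s+(.*)$", line)
        if not match:
            continue
        irq, rest = int(match.group(1)), match.group(2)
        if "," not in rest:
            continue  # a single handler, so nothing is shared on this line
        parts = [p.strip() for p in rest.split(",")]
        # The first part still carries the counters and controller type;
        # the handler name is its last whitespace-separated token.
        parts[0] = parts[0].split()[-1]
        shared[irq] = parts
    return shared


if __name__ == "__main__":
    path = "/proc/interrupts"
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
    else:
        text = SAMPLE  # fall back to the made-up sample
    for irq, handlers in sorted(shared_irqs(text).items()):
        print("IRQ %d shared by: %s" % (irq, ", ".join(handlers)))
```

If the SATA controller and the network card turn up on the same line,
that would at least be consistent with the lockup starting when the
network comes up.
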
>>
>> I have now put in about 30 hours on getting this thing to work. I have
>> been a very big Xen fan... but I am sad to say, since I'm on a deadline
>> here, I may have to look to KVM or VMware... :(
>>
>> I appreciate the input, Mike. If anyone has any great ideas here, I'd
>> love to hear them, and soon!
>>
>> Thanks.
>>
>> Chris
>>
>>
> Which version of the xen hypervisor do you have installed? If you have an
> older one, the current xenified kernel release might not work.
>
> mike
>
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
>