Re: [Xen-devel] XCP: Crashes on dual Xeon HP ProLiant systems

To: "dwight at supercomputer.org" <dwight@xxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] XCP: Crashes on dual Xeon HP ProLiant systems
From: Pasi Kärkkäinen <pasik@xxxxxx>
Date: Fri, 30 Apr 2010 21:20:07 +0300
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <201004300932.37495.dwight@xxxxxxxxxxxxxxxxx>
References: <201004300932.37495.dwight@xxxxxxxxxxxxxxxxx>
On Fri, Apr 30, 2010 at 09:32:37AM -0700, dwight at supercomputer.org wrote:
> Is anyone else running the latest XCP on HP ProLiant DL380
> systems, or on a similar dual-Xeon 8-core system? I'm seeing
> spontaneous reboots under load.
> 
> Specifically, when 4 Windows HVMs are loaded, I haven't noticed
> any reboots yet. But when running 7 or 8, the system will
> reboot within minutes. Very little information appears on
> the console.
> 
> I built a debugging version of the hypervisor, which changed
> the behavior; the system managed to stay up for 2-3 hours
> with 7 VMs running. However, it again spontaneously rebooted,
> with no real messages on the console as to why.
> 
> I can send out the console log messages this evening, along
> with the system information if there's interest. Alas, I
> don't have access to these items at the moment.
> 
> I have also been running memtest86 overnight. As of 1.5 hours into
> the test, there were no errors. But there are 48 GB of RAM
> on the system, so the testing wasn't complete when I left.
> 
> Any suggestions here? I was going to build a 32-bit kernel
> from the latest patches, but it appears CentOS 5.4 Xen is
> also not stable on these systems. I had trouble getting
> the kernel to build here, with various errors. The most
> notable of which was:
> 
> ----------------------
> CC      arch/x86/kernel/acpi/processor.o
> In file included from arch/x86/kernel/acpi/processor.c:8:
> include/linux/kernel.h:185: internal compiler error: Segmentation fault
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://bugzilla.redhat.com/bugzilla> for instructions.
> The bug is not reproducible, so it is likely a hardware or OS problem.
> make[2]: *** [arch/x86/kernel/acpi/processor.o] Error 1
> make[1]: *** [arch/x86/kernel/acpi] Error 2
> make: *** [arch/x86/kernel] Error 2
> ----------------------
> 

Uhm... the compiler really shouldn't crash.

Are you sure your hardware is OK? If the stock EL5.4 Xen also crashes,
it could well be broken hardware.

Did you try running memtest86+?

Is bare-metal Linux stable if you run, for example, a kernel build such as
"make -j8 bzImage && make -j8 modules && make clean" in a loop?

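The build loop suggested above could be scripted roughly as follows. This is
only a sketch and not something from the original thread: the helper name
stress_loop and its max_iterations parameter are made up for illustration, and
it assumes Python 3 is available on the bare-metal install. It reruns a command
until one run exits non-zero, which should not normally happen for a plain
kernel build on healthy hardware.

```python
import subprocess
import sys


def stress_loop(cmd, max_iterations=100):
    """Run cmd up to max_iterations times (hypothetical helper).

    Returns the 1-based number of the first failing iteration,
    or 0 if every iteration exited with status 0.
    """
    for i in range(1, max_iterations + 1):
        # A string is handed to the shell so that "&&" chains work;
        # a list of arguments is executed directly, without a shell.
        result = subprocess.run(cmd, shell=isinstance(cmd, str))
        if result.returncode != 0:
            print(f"iteration {i} failed with exit status {result.returncode}",
                  file=sys.stderr)
            return i
    print(f"all {max_iterations} iterations completed without error")
    return 0
```

Called from the top of a kernel source tree, something like
stress_loop("make -j8 bzImage && make -j8 modules && make clean", max_iterations=50)
would repeat the build 50 times. If it fails on a different file each time,
that would tend to point toward hardware rather than the source tree.
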
-- Pasi

