
Re: [Xen-devel] memory error?



> I have just noticed this message in my kernel logs, reporting a possible
> memory error. This would go a long way towards explaining the problems I've
> been having. This particular error occurs when I'm not running Xen, so it is
> obviously not something brought on by Xen itself.
> 
> The strange thing is that the NMI error is always followed by "TLAN: eth0:
> Adaptor Error = 0x180002", which says to me that either something is wrong
> with my network card and it is triggering an NMI, or the NMI triggers an
> error in that network adapter. The memory itself is ECC memory in a Compaq
> Proliant 1600, so maybe I can access the memory logs...
> 
> Either way, what would Xen do upon receiving an NMI? Would it spontaneously
> reboot?

Hmm, given that it's not something we've ever been able to test,
'spontaneous reboot' sounds quite possible...

In normal operation, it's relatively hard for Xen to reboot
without printing anything. It requires a 'triple fault', which
basically means the hypervisor area of the pagetable has to be
corrupt. We haven't seen a bug like that for a very long time.

The link between the NMI and the adaptor error is interesting. I
wonder if it's a parity error on the PCI bus rather than a memory
ECC failure? Try re-seating the PCI card?
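
One way to look for evidence of a PCI parity error is to read the card's
PCI Status register, which has sticky bits for parity events. The sketch
below is only an illustration, not something from the Xen or TLAN code: it
assumes a Linux kernel that exposes raw config space as a `config` file
under /sys/bus/pci/devices/<address>/, and the function names are made up
for this example. The bit positions (Status register at offset 0x06, bit 15
Detected Parity Error, bit 14 Signaled System Error, bit 8 Master Data
Parity Error) are the standard PCI ones.

```python
import struct

# The 16-bit PCI Status register lives at offset 0x06 of config space.
STATUS_OFFSET = 0x06
MASTER_DATA_PARITY_ERROR = 1 << 8   # bit 8
SIGNALED_SYSTEM_ERROR = 1 << 14     # bit 14 (SERR# asserted)
DETECTED_PARITY_ERROR = 1 << 15     # bit 15


def parity_flags(config: bytes) -> dict:
    """Decode the parity-related Status bits from raw config space bytes."""
    if len(config) < STATUS_OFFSET + 2:
        raise ValueError("need at least 8 bytes of config space")
    # Config space is little-endian; read one unsigned 16-bit value.
    (status,) = struct.unpack_from("<H", config, STATUS_OFFSET)
    return {
        "detected_parity_error": bool(status & DETECTED_PARITY_ERROR),
        "master_data_parity_error": bool(status & MASTER_DATA_PARITY_ERROR),
        "signaled_system_error": bool(status & SIGNALED_SYSTEM_ERROR),
    }


def read_flags(device_dir: str) -> dict:
    """Read flags for one device directory (hypothetical helper).

    device_dir is assumed to be a sysfs path such as
    /sys/bus/pci/devices/<address>.
    """
    with open(device_dir + "/config", "rb") as f:
        return parity_flags(f.read(64))
```

If any of these bits is set on the network card or on the bridge above it
after an NMI, that would point towards the bus rather than the RAM, though
a clear register does not rule a bus problem out.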

Ian

 
> I'm running memtest now, and will run memtest86 once I am back in the office.
> 
> James
> 
> eth2: Promiscuous mode enabled.
> eth2: Promiscuous mode enabled.
> br2: port 1(eth2) entering learning state
> br2: port 1(eth2) entering forwarding state
> br2: topology change detected, propagating
> Uhhuh. NMI received. Dazed and confused, but trying to continue
> You probably have a hardware problem with your RAM chips
> TLAN:  eth0: Adaptor Error = 0x180002
> TLAN: eth0: Starting autonegotiation.
> TLAN: eth0: Autonegotiation complete.
> TLAN: eth0: Link active with AutoNegotiation enabled, at 100Mbps Full-Duplex
> TLAN: Partner capability: 10BaseT-HD 10BaseT-FD 100baseTx-HD 100baseTx-FD
> TLAN:  eth0: Adaptor Error = 0x180002
> TLAN: eth0: Starting autonegotiation.
> TLAN: eth0: Autonegotiation complete.
> TLAN: eth0: Link active with AutoNegotiation enabled, at 100Mbps Full-Duplex
> TLAN: Partner capability: 10BaseT-HD 10BaseT-FD 100baseTx-HD 100baseTx-FD
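
To check whether the NMI really is always followed by the adaptor error,
the kernel log can be scanned mechanically. This is a rough sketch, not an
existing tool: the function name and the three-line window are assumptions
chosen for this example, and the match strings are taken from the log
excerpt above.

```python
def nmi_followed_by_tlan(lines, window=3):
    """Count NMI messages and how many are followed by a TLAN adaptor error.

    Returns (nmi_count, followed_count). An NMI counts as "followed" when a
    TLAN "Adaptor Error" line appears within `window` lines after it.
    """
    nmi_count = 0
    followed_count = 0
    for i, line in enumerate(lines):
        if "NMI received" not in line:
            continue
        nmi_count += 1
        nearby = lines[i + 1 : i + 1 + window]
        if any("TLAN" in n and "Adaptor Error" in n for n in nearby):
            followed_count += 1
    return nmi_count, followed_count
```

If the two counts differ over a longer stretch of log, the NMI and the
network card error are less tightly linked than the excerpt suggests.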


-------------------------------------------------------
This SF.Net email is sponsored by OSTG. Have you noticed the changes on
Linux.com, ITManagersJournal and NewsForge in the past few weeks? Now,
one more big change to announce. We are now OSTG- Open Source Technology
Group. Come see the changes on the new OSTG site. www.ostg.com
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.sourceforge.net/lists/listinfo/xen-devel





 

