   
 
 
 

xen-users

RE: [Xen-users] xen randomly crashes all VMs hosted on iSCSI NAS array

To: VPS Lime <vpslime@xxxxxxxxx>
Subject: RE: [Xen-users] xen randomly crashes all VMs hosted on iSCSI NAS array
From: Eric van Blokland <Eric@xxxxxxxxxxxx>
Date: Mon, 18 Oct 2010 15:31:40 +0200
Cc: "xen-users@xxxxxxxxxxxxxxxxxxx" <xen-users@xxxxxxxxxxxxxxxxxxx>

I'm not sure this is the cause of your issue: I only see messages about VMs being started, and nothing about why they might have crashed.

Be sure to check that it's really the VMs crashing; perhaps the entire server just rebooted. If not, try to get the dmesg output from when the VMs crashed. You can also run "xm dmesg" to see whether the hypervisor has anything to tell you.

About the memory squeeze: I believe this has to do with Dom0 running low on memory, though I'm not sure. You could try giving Dom0 a reasonable fixed amount of memory.

Also make sure you're not over-allocating memory. (I'm not sure you even can in Xen; I guess you might be able to, but I've never tried.)

 

From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of VPS Lime
Sent: Monday, 18 October 2010 17:16
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-users] xen randomly crashes all VMs hosted on iSCSI NAS array

 

Good suggestion on dmesg. The "memory squeeze in netback driver" message seems like a likely culprit. There is a bug report (http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=762) on this issue dating back several years, with some suggestions and other responses that did not work. Has anyone come up with a reliable fix for this on CentOS 5.5?

 

xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
device xen3.128 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen3.128: link is not ready
printk: 60 messages suppressed.
xen_net: Memory squeeze in netback driver.
blkback: ring-ref 8, event-channel 15, protocol 1 (x86_64-abi)
printk: 11 messages suppressed.
xen_net: Memory squeeze in netback driver.
ADDRCONF(NETDEV_CHANGE): xen3.120: link becomes ready
xenbr1: topology change detected, propagating
xenbr1: port 36(xen3.120) entering forwarding state
ADDRCONF(NETDEV_CHANGE): xen3.123: link becomes ready
xenbr1: topology change detected, propagating
xenbr1: port 41(xen3.123) entering forwarding state
device tap2 entered promiscuous mode
xenbr1: topology change detected, propagating
xenbr1: port 43(tap2) entering forwarding state
device xen1-112 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen1-112: link is not ready
tap2: no IPv6 routers present
device tap5 entered promiscuous mode
xenbr1: topology change detected, propagating
xenbr1: port 45(tap5) entering forwarding state
device xen3.109 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen3.109: link is not ready
tap5: no IPv6 routers present
printk: 8 messages suppressed.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xenbr1: port 46(xen3.109) entering disabled state
device xen3.109 left promiscuous mode
xenbr1: port 46(xen3.109) entering disabled state
xenbr1: port 45(tap5) entering disabled state
device tap5 left promiscuous mode
xenbr1: port 45(tap5) entering disabled state
device xen3.129 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen3.129: link is not ready
blkback: ring-ref 8, event-channel 15, protocol 1 (x86_64-abi)
ADDRCONF(NETDEV_CHANGE): xen3.129: link becomes ready
xenbr1: topology change detected, propagating
xenbr1: port 45(xen3.129) entering forwarding state
nfs: server 10.1.1.45 not responding, still trying
nfs: server 10.1.1.45 not responding, still trying
nfs: server 10.1.1.45 OK
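
To get a feel for how often each of these messages shows up in a longer capture, a small script can tally them. This is only a rough sketch: the substrings are copied from the excerpt above, while the function name, the labels, and the short sample are made up for illustration.

```python
from collections import Counter

# Substrings taken from the log excerpt above; the labels are illustrative.
PATTERNS = {
    "netback_squeeze": "Memory squeeze in netback driver",
    "nfs_not_responding": "not responding, still trying",
    "printk_suppressed": "messages suppressed",
}


def count_messages(log_text):
    """Count how many lines contain each substring in PATTERNS."""
    counts = Counter()
    for line in log_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


# A few lines from the excerpt, used here only as a quick check.
sample = """\
xen_net: Memory squeeze in netback driver.
printk: 60 messages suppressed.
xen_net: Memory squeeze in netback driver.
nfs: server 10.1.1.45 not responding, still trying
nfs: server 10.1.1.45 OK
"""

print(count_messages(sample))
```

The same function could be fed the saved output of dmesg from the time of a crash. Keep in mind that the "messages suppressed" lines mean the real number of squeeze warnings is higher than the visible count.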

 

 

 

 

 

 

On Mon, Oct 18, 2010 at 8:44 AM, Eric van Blokland <Eric@xxxxxxxxxxxx> wrote:

I've seen this happen in the past when iSCSI disks became inaccessible. It hasn't occurred for quite a while, though, even though I know I've made these disks inaccessible quite a few times. Your system appears to be up to date, however.

If it is caused by disks becoming inaccessible, you should see something about it in dmesg, along the lines of "connection …. timeout".
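
One way to look for such lines in a saved dmesg capture is a loose pattern match. This is a sketch only: the exact wording of the message is not given above, so the regular expression is a guess, and the sample lines are placeholders rather than real kernel output.

```python
import re

# Loose pattern: "connection" followed later on the same line by "timeout".
# The exact message text is not known here, so this is an assumption.
TIMEOUT_RE = re.compile(r"connection.*timeout", re.IGNORECASE)


def find_timeouts(log_text):
    """Return (line_number, line) pairs for lines matching TIMEOUT_RE."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if TIMEOUT_RE.search(line)
    ]


# Placeholder lines, not real kernel messages.
sample = "\n".join([
    "placeholder: device came up",
    "placeholder: connection <id> timeout",
    "placeholder: link is ready",
])

for number, line in find_timeouts(sample):
    print(number, line)
```

If nothing matches in the real log, it may be worth loosening the pattern further before concluding that the disks never dropped out.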

 

From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of VPS Lime
Sent: Monday, 18 October 2010 16:32
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] xen randomly crashes all VMs hosted on iSCSI NAS array

 

I inherited a Xen server that is set up to have all the VM images hosted on an iSCSI-mounted NAS array. We have been experiencing a random issue (about every 2-3 days) where Xen crashes all the VMs, leaving nothing but Domain0 running. What appears to be happening is that something causes the iSCSI mount to hiccup. Running "vgchange -a y" and restarting all the VMs brings everything back up. Nothing appears to be wrong with the NAS array: a dozen other servers are attached to it and never have a problem. The xend log does not have anything useful in it, and I'm at a loss to figure out what is causing this. The only suggestion I've heard is that memory usage may be too high, making the box unstable. If anyone has any suggestions, or can point me to additional logs I should be looking at, I'd really appreciate it.

 

Host OS: CentOS 5.5
Xen kernel: xen.gz-2.6.18-194.11.4.el5
iSCSI libraries: iscsi-initiator-utils-6.2.0.871-0.16.el5
Memory on server: 32G
Total memory allocated for VMs running paravirt: 19,384 M
Total memory allocated for VMs running HVM: 2,688 M

 

Results of xm top:

xentop - 10:11:06   Xen 3.1.2-194.11.4.el5
39 domains: 1 running, 38 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 25165116k total, 25150528k used, 14588k free    CPUs: 8 @ 1995MHz
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR SSID
  Domain-0 -----r         1583   17.1    3220540   12.8   no limit       n/a     8   32     1932    32747    0        0        0        0    0
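
The memory figures above can be cross-checked with a little arithmetic. This is a minimal sketch, assuming the two per-type guest totals are in MiB and the xentop values are in KiB; the variable and function names are illustrative only.

```python
KIB_PER_MIB = 1024

# Figures copied from the post above.
pv_guests_mib = 19_384        # total allocated to paravirt guests
hvm_guests_mib = 2_688        # total allocated to HVM guests
dom0_kib = 3_220_540          # Domain-0 MEM(k) from xentop
host_total_kib = 25_165_116   # "Mem: ... total" from xentop
host_free_kib = 14_588        # "Mem: ... free" from xentop


def summarize():
    """Convert everything to MiB so the figures can be compared directly."""
    return {
        "guests_mib": pv_guests_mib + hvm_guests_mib,
        "dom0_mib": round(dom0_kib / KIB_PER_MIB),
        "host_total_mib": round(host_total_kib / KIB_PER_MIB),
        "free_mib": round(host_free_kib / KIB_PER_MIB, 1),
    }


for name, value in summarize().items():
    print(f"{name}: {value}")
```

On these numbers xentop reports only about 14 MiB free, so there is very little headroom left for Dom0 or the backend drivers. The xentop total also works out to roughly 24 GiB rather than the 32G quoted above, which may be worth double-checking.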

 

 

 
