[Xen-users] Weird ram/dom-u limit problem

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] Weird ram/dom-u limit problem
From: David Halik <dhalik@xxxxxxxxxxxxxxx>
Date: Wed, 20 Jan 2010 14:20:33 -0500

Hello all,

I've been having a very odd problem the last few days and I'm at a loss on how to proceed, so I thought I'd ask the user community. I'm hitting what *appears* to be some kind of wall on either the number of VMs or the amount of RAM they're using (personally I think it's RAM, but I don't see how).

We have two identical CentOS 5.4 xen servers (xen-3.0.3-94.el5_4.2) in production with 7 VMs and dom0 on each server. I'll post the full config at the bottom of the email, but here are the basics.

xentop - 13:58:32   Xen 3.1.2-164.6.1.el5
8 domains: 1 running, 7 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 16776420k total, 5582324k used, 11194096k free    CPUs: 4 @ 3000MHz

dom0_mem=1g
loops = 64

Now, if I attempt to install an 8th VM with 512MB of RAM on either server, the install runs for a while and then suddenly the load on the xen server begins to grow and grow and grow, going from a load average of 0.09, 0.15, 0.10 to a load average of 10.58, 7.88, 4.70. The box itself becomes very slow and bogged down. top shows that it's because of CPU iowait, but I can't find any particular reason for it.

top - 13:35:18 up 63 days, 13:54,  3 users,  load average: 10.58, 7.88, 4.70
Tasks: 196 total,   1 running, 195 sleeping,   0 stopped,   0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 24.9%id, 74.5%wa, 0.0%hi, 0.0%si, 0.7%st

75% is ridiculously high, and it keeps getting worse until the install barely finishes, taking much longer than normal. An average install takes 20 minutes; they're taking around 35-40 minutes now, and I fear that if I add yet another VM it will be even worse. Once the install finishes, the box returns to normal and I can boot the new VM normally as well.
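
To track the iowait figure over the course of an install rather than eyeballing it, the `Cpu(s)` summary line can be parsed. This is only a minimal sketch: the line is copied by hand from the top output above, and the function name `parse_cpu_line` is made up for illustration. It assumes the older procps layout shown here (`74.5%wa`); newer versions of top format this line differently.

```python
import re

# Summary line copied verbatim from the top output quoted above.
CPU_LINE = "Cpu(s): 0.0%us, 0.0%sy, 0.0%ni, 24.9%id, 74.5%wa, 0.0%hi, 0.0%si, 0.7%st"

def parse_cpu_line(line):
    """Return a dict mapping each two-letter field (us, sy, wa, ...) to its percentage."""
    pairs = re.findall(r"([\d.]+)%(\w\w)", line)
    return {name: float(value) for value, name in pairs}

fields = parse_cpu_line(CPU_LINE)
print("iowait: %.1f%%, idle: %.1f%%" % (fields["wa"], fields["id"]))
```

Logging the `wa` value at intervals during an install with 7 VMs running and again with 6 would give a direct comparison of the two cases.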

If I shut down one of the existing VMs using 512MB of RAM, then the install works perfectly and the load never goes crazy. Turn the 7th back on, and installing the 8th is horrible again. I can duplicate this on both servers, which is very weird, but I can't figure out *why* it's happening. The fact that both servers have the problem leads me to believe it's a config problem.

I'm leaning towards something running out of RAM and swapping hard, which is why load takes a while to start climbing and the CPU wait gets very high, yet I have *more* than enough memory allocated to everything. The installs are very I/O intensive, but I can't understand why that 8th VM is causing hell.

Is there some weird limit to the number of VMs or the amount of RAM I can allocate? Does dom0 need more than 1g and that's why it's getting cranky? Is there a config option I'm not aware of?
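
As a rough check of the "more than enough memory" claim, the arithmetic can be worked through from the xentop `Mem:` line quoted earlier. This is a sketch under stated assumptions: the figures are copied by hand, the variable names are illustrative, and it only covers memory as seen by the hypervisor. It says nothing about what happens inside dom0's own 1g allocation.

```python
KIB_PER_MIB = 1024

# Figures from the xentop "Mem:" line (KiB).
total_kib = 16776420
used_kib = 5582324
free_kib = 11194096

# Size of the 8th guest being installed.
new_guest_mib = 512

# xentop's own numbers should be self-consistent.
assert used_kib + free_kib == total_kib

# Hypervisor memory left over once the new guest is allocated.
free_after_kib = free_kib - new_guest_mib * KIB_PER_MIB
print("free before: %d MiB" % (free_kib // KIB_PER_MIB))
print("free after:  %d MiB" % (free_after_kib // KIB_PER_MIB))
```

The "free before" figure agrees with the `free_memory` value reported by `xm info` below, so at the hypervisor level a 512MB guest should fit many times over.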

Any help would be appreciated, this is really worrying me. Thanks.

Swap:
# free
             total       used       free     shared    buffers     cached
Mem:       1048576    1039176       9400          0      10924     777816
-/+ buffers/cache:     250436     798140
Swap:     10482404        188   10482216
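
The swapping theory can be sanity-checked against the `free` output above. The sketch below uses figures copied by hand from that output, and the helper name `summarize` is hypothetical. It separates memory held by processes from memory used as buffers and page cache, which the kernel can reclaim.

```python
# Figures copied from the `free` output above (all in KiB).
mem_total = 1048576
mem_used = 1039176
buffers = 10924
cached = 777816
swap_total = 10482404
swap_used = 188

def summarize(mem_total, mem_used, buffers, cached, swap_total, swap_used):
    """Split 'used' memory into process memory versus reclaimable cache."""
    app_used = mem_used - buffers - cached
    return {
        "app_used_kib": app_used,
        "app_used_pct": 100.0 * app_used / mem_total,
        "swap_used_pct": 100.0 * swap_used / swap_total,
    }

summary = summarize(mem_total, mem_used, buffers, cached, swap_total, swap_used)
print("process memory: %d KiB (%.1f%% of dom0 RAM)"
      % (summary["app_used_kib"], summary["app_used_pct"]))
print("swap in use: %.4f%%" % summary["swap_used_pct"])
```

The computed process figure matches the `-/+ buffers/cache` row, and swap usage is negligible in this snapshot. Note that this is a single sample, and it is not stated whether it was taken during an install, so it does not rule out memory pressure at the moment the load climbs.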

Full config:

# xm info
host                   : xen2.rutgers.edu
release                : 2.6.18-164.6.1.el5xen
version                : #1 SMP Tue Nov 3 16:48:13 EST 2009
machine                : x86_64
nr_cpus                : 4
nr_nodes               : 1
sockets_per_node       : 2
cores_per_socket       : 2
threads_per_core       : 1
cpu_mhz                : 3000
hw_caps : 178bfbff:ebd3fbff:00000000:00000010:00002001:00000000:0000001f
total_memory           : 16383
free_memory            : 10931
node_to_cpu            : node0:0-3
xen_major              : 3
xen_minor              : 1
xen_extra              : .2-164.6.1.el5
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_pagesize           : 4096
platform_params        : virt_start=0xffff800000000000
xen_changeset          : unavailable
cc_compiler            : gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)
cc_compile_by          : mockbuild
cc_compile_domain      : centos.org
cc_compile_date        : Tue Nov  3 16:04:14 EST 2009
xend_config_format     : 2


--
================================
David Halik
System Administrator
OIT-CSS Rutgers University
dhalik@xxxxxxxxxxxxxxx
================================

