WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-users

[Xen-users] dom0 - oom-killer - memory leak somewhere ?

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] dom0 - oom-killer - memory leak somewhere ?
From: Adrien Urban <adrien.urban@xxxxxxxxxxxxxx>
Date: Sun, 13 Nov 2011 10:29:15 +0100
Hello,

I work at a hosting company. We have dozens of Xen dom0s running just fine,
but unfortunately a few of them get out of control.

Reported behaviour:
- dom0 uses more and more memory
- no process can be found using that memory
- at some point the oom killer kicks in and kills everything, until even
logging in to the box over ssh becomes hard
- when there are no processes left to kill, the box degrades further,
and we are forced to reboot
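
One rough way to put a number on "memory that no process is using" is to subtract the usual kernel counters from the used total in /proc/meminfo. The sketch below is only an illustration and is not taken from the attached files. The function names are hypothetical, and the choice of counters (Buffers, Cached, Slab, AnonPages, PageTables) is an assumption about what counts as explained memory. A large positive result that keeps growing would be consistent with the behaviour described above, but does not by itself identify the cause.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Most values are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def unattributed_kb(info):
    """Rough estimate (kB) of used memory not explained by the usual counters."""
    used = info["MemTotal"] - info["MemFree"]
    # Assumption: these counters cover page cache, slab and process memory.
    explained = sum(
        info.get(key, 0)
        for key in ("Buffers", "Cached", "Slab", "AnonPages", "PageTables")
    )
    return used - explained


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print("unattributed kB:", unattributed_kb(parse_meminfo(f.read())))
```

Logging this value periodically on a healthy dom0 and on an affected one would show whether the gap grows only on the affected host.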

Configuration summary:
- dom0 with debian/stable, xen 4.0.1
- 512MB of dom0 memory, or up to 2GB after some of the crashes


I have tried to find something that differs between a working dom0 and a
buggy one, but didn't manage to find anything. They are installed from the
same template, with the same packages and the same hardware (apart from
serial numbers and MAC addresses).


I also didn't manage to find any report that clearly describes a leak in
dom0 ending up with the oom killer.

I tried to gather as many logs as I thought could be helpful in the
attachments[1].
Host bk - about to get a reboot, as xend has already been killed
Host sw - 800MB/2GB used for nothing
The attachments[1] contain:

- memory graph (by munin), which might help to see the pattern of memory
usage

Contents (cat) of:
- grub.cfg
- /proc/meminfo
- /proc/slabinfo
- /proc/vmstat
- /var/log/kern.log
- /var/log/xen/xend.log

Output of:
- dmesg
- dpkg -l
- free
- lsmod
- top
- vmstat
- xm info
- xm info -c
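
Since /proc/slabinfo is among the files listed, two snapshots taken some time apart can be compared to see which kernel caches grew. The sketch below is a hedged illustration, not part of the attached data. The function names are hypothetical, and it assumes the slabinfo 2.x column layout (name, active_objs, num_objs, objsize, ...). Size is approximated as num_objs * objsize, which ignores per-slab overhead.

```python
def parse_slabinfo(text):
    """Map each slab cache name to its approximate size in bytes.

    Size is estimated as num_objs * objsize (columns 3 and 4 of slabinfo 2.x).
    """
    sizes = {}
    for line in text.splitlines():
        # Skip blank lines, the version line and the column header.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        parts = line.split()
        try:
            sizes[parts[0]] = int(parts[2]) * int(parts[3])
        except (IndexError, ValueError):
            continue  # ignore lines that do not match the assumed layout
    return sizes


def slab_growth(before_text, after_text):
    """Return (cache name, byte growth) pairs, largest growth first."""
    before = parse_slabinfo(before_text)
    after = parse_slabinfo(after_text)
    # A cache missing from the first snapshot is treated as growing from 0.
    growth = {name: size - before.get(name, 0) for name, size in after.items()}
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)
```

Passing the text of an earlier and a later snapshot to `slab_growth` and looking at the first few entries shows which caches, if any, account for the increase. If none of them grow while used memory does, the leak is probably outside the slab allocator.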


I'd appreciate any feedback about such behaviour, and would be happy to
provide additional information.
These are production servers, so the one thing I'd really like to avoid
as much as possible is rebooting them for tests.


Regards,

-- 
Adrien URBAN

[1] I sent an email with the files as attachments a few days ago, but it
never made it to the list.
The files can be found here: http://www.hagtheil.net/xen/oom/

