Re: [Xen-users] dom0 got load of 80
There is one other possibility that could be going on here.
I once saw a very similar failure mode to what Heiko describes, when a
domU mistakenly tried to do hardware RAID maintenance and scanning
from inside the domU. Not pretty. That sent the dom0 load very high
and eventually led to a crash of everything.
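If it happens again, it is worth catching in the moment which guest
(or which dom0 process) is responsible. Something along these lines,
run on dom0, is what I would try. It is a rough, untested sketch: the
load threshold and log file path are only placeholders, and it assumes
the 'xm' tool is on the PATH and a normal /proc on dom0.

#!/usr/bin/env python
# Rough, untested sketch of a dom0 load/memory watchdog.
# Assumes 'xm' (Xen 3.1) is on the PATH and a normal /proc on dom0.
# LOAD_THRESHOLD and LOGFILE are only placeholders -- adjust to taste.

import subprocess
import time

LOAD_THRESHOLD = 10.0                 # 1-minute load that triggers a snapshot
POLL_SECONDS   = 30                   # how often to check
LOGFILE        = "/var/log/dom0-watchdog.log"

def read_loadavg():
    # First field of /proc/loadavg is the 1-minute load average.
    f = open("/proc/loadavg")
    load = float(f.read().split()[0])
    f.close()
    return load

def read_meminfo():
    # Pull MemFree and SwapFree (in kB) out of /proc/meminfo.
    values = {}
    for line in open("/proc/meminfo"):
        key = line.split(":")[0]
        if key in ("MemFree", "SwapFree"):
            values[key] = int(line.split()[1])
    return values

def snapshot_domains():
    # Capture 'xm list' so we can see which domU was busy at the time.
    pipe = subprocess.Popen(["xm", "list"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return pipe.communicate()[0]

while True:
    load = read_loadavg()
    mem = read_meminfo()
    if load > LOAD_THRESHOLD or mem.get("SwapFree", 1) == 0:
        log = open(LOGFILE, "a")
        log.write("%s load=%.2f MemFree=%skB SwapFree=%skB\n"
                  % (time.ctime(), load,
                     mem.get("MemFree"), mem.get("SwapFree")))
        log.write(snapshot_domains() + "\n")
        log.close()
    time.sleep(POLL_SECONDS)

Nothing fancy, but a timestamped 'xm list' plus MemFree/SwapFree from
just before the box falls over usually tells you whether a guest or a
dom0 process was eating it.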
Steve Timm
On Wed, 28 Jan 2009, Heiko wrote:
Hello,
last night I had a dom0 that was reported down by our monitoring, and
it became available again after 50 minutes.
I also had messages that some domUs on that machine were not online.
Can it be that one domU has so much load that it can take down all
the others, incl. dom0?
Right after the dom0 became available again I could see that it had a
high load: "CRITICAL - load average: 0.18, 5.97, 81.42"
I remember seeing something similar on another dom0; there I could
see that some python processes were going wild.
Is this Xen related? There is nothing else running on the machine.
My machines are CentOS 5.2, only with packages from the official
repositories (Xen 3.1.0).
In the logfile for the problem machine from today I find this:
Jan 28 06:07:32 x1blade1 kernel: python invoked oom-killer:
gfp_mask=0x201d2, order=0, oomkilladj=0
Jan 28 06:07:32 x1blade1 kernel:
Jan 28 06:07:32 x1blade1 kernel: Call Trace:
Jan 28 06:07:32 x1blade1 kernel: [<ffffffff802b4896>] out_of_memory+0x8b/0x203
Jan 28 06:07:32 x1blade1 kernel: [<ffffffff8020f05e>] __alloc_pages+0x22b/0x2b4
Jan 28 06:07:32 x1blade1 kernel: [<ffffffff802129fb>] __do_page_cache_readahead+0xd0/0x21c
Jan 28 06:07:32 x1blade1 kernel: [<ffffffff802606a8>] __wait_on_bit_lock+0x5b/0x66
Jan 28 06:07:33 x1blade1 nrpe[345]: Error: Could not complete SSL handshake. 5
Jan 28 06:07:33 x1blade1 nrpe[342]: Error: Could not complete SSL handshake. 5
Jan 28 06:07:33 x1blade1 kernel: [<ffffffff8023fd18>] __lock_page+0x5e/0x64
Jan 28 06:07:36 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53536
Jan 28 06:08:35 x1blade1 kernel: [<ffffffff802132c0>] filemap_nopage+0x148/0x322
Jan 28 06:09:10 x1blade1 kernel: [<ffffffff80208ba1>] __handle_mm_fault+0x3d9/0xf4d
Jan 28 06:09:25 x1blade1 kernel: [<ffffffff80261869>] _spin_lock_irqsave+0x9/0x14
Jan 28 06:10:28 x1blade1 kernel: [<ffffffff802641bf>] do_page_fault+0xe4c/0x11e0
Jan 28 06:13:01 x1blade1 kernel: [<ffffffff8025d823>] error_exit+0x0/0x6e
Jan 28 06:13:54 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53536
Jan 28 06:15:16 x1blade1 kernel:
Jan 28 06:16:23 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53536
Jan 28 06:18:38 x1blade1 kernel: Mem-info:
Jan 28 06:18:50 x1blade1 snmpd[4492]: Connection from UDP: [172.17.3.161]:50744
Jan 28 06:18:50 x1blade1 kernel: DMA per-cpu:
Jan 28 06:18:50 x1blade1 snmpd[4492]: Received SNMP packet(s) from UDP: [172.17.3.161]:50744
Jan 28 06:18:50 x1blade1 kernel: cpu 0 hot: high 186, batch 31 used:73
...repeats a lot
Jan 28 06:18:54 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:54 x1blade1 kernel: HighMem per-cpu: empty
Jan 28 06:18:54 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:54 x1blade1 kernel: Free pages: 16204kB (0kB HighMem)
Jan 28 06:18:54 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:54 x1blade1 kernel: Active:970160 inactive:1278826 dirty:2 writeback:0 unstable:0 free:4051 slab:39078 mapped-file:1131 mapped-anon:2248568 pagetables:167956
Jan 28 06:18:55 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:55 x1blade1 kernel: DMA free:16204kB min:16204kB low:20252kB high:24304kB active:3883200kB inactive:5112872kB present:16411728kB pages_scanned:22590050 all_unreclaimable? yes
Jan 28 06:18:55 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:55 x1blade1 kernel: lowmem_reserve[]: 0 0 0 0
Jan 28 06:18:55 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:55 x1blade1 kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Jan 28 06:18:55 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:55 x1blade1 kernel: lowmem_reserve[]: 0 0 0 0
Jan 28 06:18:55 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:55 x1blade1 kernel: Normal free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Jan 28 06:18:56 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:56 x1blade1 kernel: lowmem_reserve[]: 0 0 0 0
Jan 28 06:18:56 x1blade1 snmpd[4492]: Connection from UDP: [172.17.4.161]:53542
Jan 28 06:18:56 x1blade1 kernel: HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Jan 28 06:18:56 x1blade1 kernel: lowmem_reserve[]: 0 0 0 0
Jan 28 06:18:56 x1blade1 kernel: DMA: 25*4kB 7*8kB 5*16kB 1*32kB 1*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 3*4096kB = 16204kB
Jan 28 06:18:56 x1blade1 kernel: DMA32: empty
Jan 28 06:18:56 x1blade1 kernel: Normal: empty
Jan 28 06:18:57 x1blade1 kernel: HighMem: empty
Jan 28 06:18:57 x1blade1 kernel: Swap cache: add 513883, delete 513883, find 28240/28470, race 0+0
Jan 28 06:18:57 x1blade1 kernel: Free swap = 0kB
Jan 28 06:18:57 x1blade1 kernel: Total swap = 2048276kB
Jan 28 06:18:57 x1blade1 kernel: Free swap: 0kB
Jan 28 06:18:57 x1blade1 kernel: 4102932 pages of RAM
Jan 28 06:18:57 x1blade1 kernel: 97982 reserved pages
Jan 28 06:18:57 x1blade1 kernel: 753073 pages shared
Jan 28 06:18:58 x1blade1 kernel: 0 pages swap cached
Jan 28 06:18:58 x1blade1 kernel: Out of memory: Killed process 6991 (python).
greetings
.r
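As to the question whether a single domU can really take down
everything incl. dom0: the hypervisor will keep scheduling the guests,
but the backend disk and network drivers and xend all live in dom0, so
once dom0 itself is starved -- and your log shows dom0 with Free swap =
0kB before the OOM killer shot the python process -- the guests can
look dead to your monitoring even though they are still technically
running. When that happens it helps to know what was eating dom0's
memory. Here is another rough, untested sketch (plain /proc reading,
nothing Xen-specific; the output format is my own invention) that you
could run from cron every few minutes or by hand:

#!/usr/bin/env python
# Untested sketch: list the processes with the largest resident set on
# dom0, so the OOM killer's choice of victim is less of a surprise.

import os

def rss_kb(pid):
    # VmRSS from /proc/<pid>/status, in kB; 0 if the process is gone.
    try:
        for line in open("/proc/%s/status" % pid):
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    except (IOError, ValueError):
        pass
    return 0

def cmdline(pid):
    # Command line, with NUL separators turned into spaces.
    try:
        raw = open("/proc/%s/cmdline" % pid).read()
        return raw.replace("\0", " ").strip() or "?"
    except IOError:
        return "?"

pids = [p for p in os.listdir("/proc") if p.isdigit()]
for pid in sorted(pids, key=rss_kb, reverse=True)[:10]:
    print("%8d kB  pid %-6s %s" % (rss_kb(pid), pid, cmdline(pid)))

It may also be worth pinning dom0's memory so ballooning cannot squeeze
it (dom0_mem on the hypervisor line in grub.conf, dom0-min-mem in
xend-config.sxp), but check the 3.1 docs for the exact syntax.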
--
------------------------------------------------------------------
Steven C. Timm, Ph.D (630) 840-8525
timm@xxxxxxxx http://home.fnal.gov/~timm/
Fermilab Computing Division, Scientific Computing Facilities,
Grid Facilities Department, FermiGrid Services Group, Assistant Group Leader.
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users