
Re: [Xen-devel] signal 7 crash



Well, I also hit a qemu-dm crash due to signal 7, but the stack trace from the core dump is different.

#0  0x00000000004ae0c3 in memcpy_words (dst=0x7f35cc5c5008, src="" n=1) at exec-dm.c:492
#1  0x00000000004ae3c2 in cpu_physical_memory_rw (_addr=1068576768, buf=0x7f35cc5c5008 <Address 0x7f35cc5c5008 out of bounds>, _len=1, 
    is_write=0) at exec-dm.c:613
#2  0x00000000004af60f in read_physical (addr=1068576768, size=1, val=0x7f35cc5c5008) at helper2.c:322
#3  0x00000000004af6ae in cpu_ioreq_move (env=0xb1e710, req=0x7f35cc5c5000) at helper2.c:374
#4  0x00000000004afa0c in __handle_ioreq (env=0xb1e710, req=0x7f35cc5c5000) at helper2.c:449
#5  0x00000000004afca3 in cpu_handle_ioreq (opaque=0xb1e710) at helper2.c:515
#6  0x000000000040bdd2 in main_loop_wait (timeout=10) at /usr/src/redhat/BUILD/xen-4.0.1/tools/ioemu-dir/vl.c:3788
#7  0x00000000004afe86 in main_loop () at helper2.c:576
#8  0x000000000040f1ab in main (argc=25, argv=0x7fff2ece6df8, envp=0x7fff2ece6ec8)
    at /usr/src/redhat/BUILD/xen-4.0.1/tools/ioemu-dir/vl.c:6150

Judging from the stack, the read of address 1068576768 should not have ended up at helper2.c:374.

The crashed VM is Windows 2003 R2 64-bit with the Windows GPL PV drivers installed.

I found that Xen has four MMIO ranges:
    &hpet_mmio,
    &vlapic_mmio,
    &vioapic_mmio,
    &msixtbl_mmio

Address 1068576768 isn't in any of these four ranges, and it doesn't fall in any of qemu's MMIO ranges either.

My guess is that the address is invalid for this operation.
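
A quick way to double-check the range reasoning is to convert the address to hex and compare it against the fixed MMIO windows. The sketch below is only an illustration: it assumes the conventional x86 default base addresses (IOAPIC at 0xFEC00000, HPET at 0xFED00000, local APIC at 0xFEE00000), and the window sizes are placeholder assumptions rather than values read from this Xen build. The MSI-X table range is left out because it is assigned per device at runtime. `find_mmio_range` is a hypothetical helper name.

```python
# Assumed default x86 MMIO windows as (base, size). The bases are the
# conventional defaults; the sizes are illustrative assumptions only.
MMIO_RANGES = {
    "vioapic": (0xFEC00000, 0x1000),
    "hpet": (0xFED00000, 0x1000),
    "vlapic": (0xFEE00000, 0x1000),
}

PAGE_SIZE = 4096


def find_mmio_range(addr, ranges=MMIO_RANGES):
    """Return the name of the range containing addr, or None."""
    for name, (base, size) in ranges.items():
        if base <= addr < base + size:
            return name
    return None


addr = 1068576768  # address passed to cpu_physical_memory_rw in the backtrace
print(hex(addr))              # hex form of the address
print(addr % PAGE_SIZE == 0)  # is it page-aligned?
print(addr // (1024 * 1024))  # offset from zero in whole MiB
print(find_mmio_range(addr))  # matching assumed MMIO window, if any
```

Under these assumptions the address is 0x3fb13000, which is page-aligned, a little above 1019 MiB, and far below the windows near 4 GiB. That would make it look more like an ordinary guest-physical address than a device register, but it does not by itself explain why the access faulted.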

Does anybody have an idea or any advice?



Regards,
wanjia

2011/7/14 jbuy0710 <juby0710@xxxxxxxxx>
Hi ,

We encountered a signal 7 crash once in our environment with Xen 4.0.1.
We repeatedly start about 10~15 VMs at the same time and stop them
after about 15 minutes.

I found the following in xend.log when we tried to stop this VM:
[2011-07-13 06:50:12 12757] WARNING (image:552) domain 5f244213-2dc1-4700-9714-6b20c3f85fc4: device model failure: pid 4110: died due to signal 7 (core dumped); see /var/log/xen/qemu-dm-5f244213-2dc1-4700-9714-6b20c3f85fc4.log
[2011-07-13 06:50:12 12757] WARNING (XendDomainInfo:2071) Domain has crashed: name=5f244213-2dc1-4700-9714-6b20c3f85fc4 id=144.
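
When this happens across many VMs, it can help to pull the failures out of xend.log mechanically. The following is a minimal sketch, assuming the warning always has the wording shown above; the regular expression and the helper names (`parse_device_model_failures`, `signal_name`) are made up for illustration and are not part of xend. Signal numbering is platform-specific; on Linux x86, signal 7 is SIGBUS.

```python
import re
import signal

# Pattern for xend's "device model failure" warning; \s+ tolerates the
# line wrapping that mail clients introduce into the log text.
FAILURE_RE = re.compile(
    r"domain\s+(?P<domain>[0-9a-f-]+):\s+device\s+model\s+failure:\s+"
    r"pid\s+(?P<pid>\d+):\s+died\s+due\s+to\s+signal\s+(?P<signo>\d+)"
)


def parse_device_model_failures(log_text):
    """Return (domain, pid, signal number) for each failure in log_text."""
    return [
        (m.group("domain"), int(m.group("pid")), int(m.group("signo")))
        for m in FAILURE_RE.finditer(log_text)
    ]


def signal_name(signo):
    """Best-effort name lookup on the machine running this script."""
    try:
        return signal.Signals(signo).name
    except ValueError:
        return "unknown"


sample = """[2011-07-13 06:50:12 12757] WARNING (image:552) domain
5f244213-2dc1-4700-9714-6b20c3f85fc4: device model failure: pid 4110:
died due to signal 7 (core dumped); see
/var/log/xen/qemu-dm-5f244213-2dc1-4700-9714-6b20c3f85fc4.log"""

for domain, pid, signo in parse_device_model_failures(sample):
    print(domain, pid, signo, signal_name(signo))
```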

The XML definition of our VMs looks like this:
<domain type='xen' id='274'>
 <name>45087c07-c929-4e41-bc11-215fa7088ee5</name>
 <uuid>45087c07-c929-4e41-bc11-215fa7088ee5</uuid>
 <memory>256000</memory>
 <currentMemory>256000</currentMemory>
 <vcpu cpuset='1-3'>1</vcpu>
 <os>
   <type>hvm</type>
   <loader>/usr/lib/xen/boot/hvmloader</loader>
   <boot dev='hd'/>
 </os>
 <features>
   <acpi/>
   <apic/>
   <pae/>
 </features>
 <clock offset='utc'/>
 <on_poweroff>destroy</on_poweroff>
 <on_reboot>restart</on_reboot>
 <on_crash>destroy</on_crash>
 <devices>
   <emulator>/usr/lib64/xen/bin/qemu-dm</emulator>
   <disk type='file' device='disk'>
     <driver name='file'/>
     <source file='/07dbc62b-3bd7-41d7-865a-6632019e2f2b'/>
     <target dev='hda' bus='ide'/>
   </disk>
   <interface type='bridge'>
     <mac address='00:16:3e:00:01:00'/>
     <source bridge='teprod'/>
     <script path='/etc/xen/scripts/vif-bridge'/>
     <target dev='vif274.0'/>
     <model type='e1000'/>
   </interface>
   <serial type='pty'>
     <source path='/dev/pts/3'/>
     <target port='0'/>
   </serial>
   <console type='pty' tty='/dev/pts/3'>
     <source path='/dev/pts/3'/>
     <target type='serial' port='0'/>
   </console>
   <input type='tablet' bus='usb'/>
   <input type='mouse' bus='ps2'/>
   <graphics type='vnc' port='5903' autoport='yes' keymap='en-us'/>
 </devices>
</domain>
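
Since the crash path goes through e1000_receive, it may be useful to list which emulated NIC model each domain uses when comparing crashing and non-crashing VMs. Here is a rough sketch using Python's standard xml.etree.ElementTree on a trimmed copy of the definition above; `summarize_domain` is a hypothetical helper, and the memory value is assumed to be in KiB, which is libvirt's default unit.

```python
import xml.etree.ElementTree as ET


def summarize_domain(xml_text):
    """Extract a few fields from a libvirt domain XML definition."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        # Assumes no explicit unit attribute, i.e. the value is in KiB.
        "memory_kib": int(root.findtext("memory")),
        "nic_models": [
            model.get("type")
            for model in root.findall("./devices/interface/model")
        ],
    }


# Trimmed copy of the domain definition, keeping only the fields used here.
sample_xml = """
<domain type='xen' id='274'>
  <name>45087c07-c929-4e41-bc11-215fa7088ee5</name>
  <memory>256000</memory>
  <devices>
    <interface type='bridge'>
      <mac address='00:16:3e:00:01:00'/>
      <model type='e1000'/>
    </interface>
  </devices>
</domain>
"""

summary = summarize_domain(sample_xml)
print(summary["name"], summary["memory_kib"], summary["nic_models"])
```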


We tried to find the root cause using the qemu-dm core dump file and
gdb. The backtrace shows:
(gdb) bt
#0  0x00000000004765d4 in memcpy_words (_addr=<value optimized out>, buf=<value optimized out>, _len=<value optimized out>, is_write=1)
    at exec-dm.c:492
#1  cpu_physical_memory_rw (_addr=<value optimized out>, buf=<value optimized out>, _len=<value optimized out>, is_write=1) at exec-dm.c:581
#2  0x00000000004292ad in cpu_physical_memory_write (opaque=0x7fce7d55d010, buf=<value optimized out>, size=64) at ../cpu-all.h:932
#3  e1000_receive (opaque=0x7fce7d55d010, buf=<value optimized out>, size=64)
    at /root/rpmbuild/BUILD/xen-4.0.1/tools/ioemu-dir/hw/e1000.c:640
#4  0x0000000000497c11 in qemu_send_packet (vc1=0x11dc9c0, buf=0x7fff0186ba10 "\377\377\377\377\377\377\270\254o}\202\343\b\006", size=60)
    at net.c:412
#5  0x00000000004985a8 in tap_send (opaque=<value optimized out>) at net.c:751
#6  0x000000000040895c in main_loop_wait (timeout=<value optimized out>) at /root/rpmbuild/BUILD/xen-4.0.1/tools/ioemu-dir/vl.c:3788
#7  0x000000000047798a in main_loop () at helper2.c:576
#8  0x000000000040ccd1 in main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>)
    at /root/rpmbuild/BUILD/xen-4.0.1/tools/ioemu-dir/vl.c:6150
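
The two backtraces in this thread can be compared mechanically to see where they agree. The sketch below is one possible way to do it, with the argument lists abbreviated to `(...)` for brevity; the regular expression and the helpers `frame_functions` and `shared_innermost` are illustrative assumptions, not part of gdb.

```python
import re

# Matches the start of a gdb frame: "#N  [0xADDR in] function (".
FRAME_RE = re.compile(r"#(\d+)\s+(?:0x[0-9a-f]+\s+in\s+)?(\w+)\s+\(")


def frame_functions(backtrace):
    """Return function names ordered from innermost to outermost frame."""
    frames = sorted((int(n), fn) for n, fn in FRAME_RE.findall(backtrace))
    return [fn for _, fn in frames]


def shared_innermost(a, b):
    """Return the leading frames that both traces have in common."""
    shared = []
    for fa, fb in zip(a, b):
        if fa != fb:
            break
        shared.append(fa)
    return shared


# Frames copied from the two backtraces, arguments abbreviated.
ioreq_trace = """
#0  0x00000000004ae0c3 in memcpy_words (...) at exec-dm.c:492
#1  0x00000000004ae3c2 in cpu_physical_memory_rw (...) at exec-dm.c:613
#2  0x00000000004af60f in read_physical (...) at helper2.c:322
#3  0x00000000004af6ae in cpu_ioreq_move (...) at helper2.c:374
"""
e1000_trace = """
#0  0x00000000004765d4 in memcpy_words (...) at exec-dm.c:492
#1  cpu_physical_memory_rw (...) at exec-dm.c:581
#2  0x00000000004292ad in cpu_physical_memory_write (...) at ../cpu-all.h:932
#3  e1000_receive (...) at hw/e1000.c:640
"""

common = shared_innermost(
    frame_functions(ioreq_trace), frame_functions(e1000_trace)
)
print(common)
```

Both traces fault in memcpy_words at exec-dm.c:492, reached through cpu_physical_memory_rw, even though one starts from an ioreq read and the other from an e1000 receive. That suggests the common factor is the guest memory copy itself rather than either caller, although the traces alone do not prove it.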


Is this a known issue or is there any patch to fix it?

Thanks

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
