
Re: [Xen-devel] frequently ballooning results in qemu exit




> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxx [mailto:xen-devel-
> bounces@xxxxxxxxxxxxx] On Behalf Of Tim Deegan
> Sent: March 14, 2013 22:34
> To: Hanweidong
> Cc: George Dunlap; Andrew Cooper; Yanqiangjun; xen-devel@xxxxxxxxxxxxx;
> Gonglei (Arei); Anthony PERARD
> Subject: Re: [Xen-devel] frequently ballooning results in qemu exit
> 
> At 14:10 +0000 on 14 Mar (1363270234), Hanweidong wrote:
> > > >> The call trace:
> > > >> Program received signal SIGBUS, Bus error.
> > > >> 0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > > >> (gdb) bt
> > > >> #0  0x00007f94f74773d7 in memcpy () from /lib64/libc.so.6
> > > >> #1  0x00007f94fa67016d in address_space_rw (as=<optimized out>, addr=2042531840, buf=0x7fffa36accf8 "", len=4, is_write=true) at /usr/include/bits/string3.h:52
> > > >> #2  0x00007f94fa747cf0 in rw_phys_req_item (rw=<optimized out>, val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>)
> > > >>     at /opt/new/tools/qemu-xen-dir/xen-all.c:709
> > > >> #3  write_phys_req_item (val=<optimized out>, i=<optimized out>, req=<optimized out>, addr=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:720
> > > >> #4  cpu_ioreq_pio (req=<optimized out>) at /opt/new/tools/qemu-xen-dir/xen-all.c:736
> > > >> #5  handle_ioreq (req=0x7f94fa464000) at /opt/new/tools/qemu-xen-dir/xen-all.c:793
> > > >> #6  0x00007f94fa748abe in cpu_handle_ioreq (opaque=0x7f94fb39d3f0) at /opt/new/tools/qemu-xen-dir/xen-all.c:868
> > > >> #7  0x00007f94fa5e3262 in qemu_iohandler_poll (readfds=0x7f94faeea7a0 <rfds>, writefds=0x7f94faeea820 <wfds>, xfds=<optimized out>, ret=<optimized out>) at iohandler.c:125
> > > >> #8  0x00007f94fa5ec51d in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:418
> > > >> #9  0x00007f94fa6616dc in main_loop () at vl.c:1770
> > > >> #10 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:3999
> > > >>
> > > >> It looks mapcache has something wrong because memcpy failed
> > > >> with the address from mapcache. Any ideas about this issue?
> > > >> Thanks!
> > > >
> > > > Which version of Xen and qemu are you using?  In particular,
> > > > qemu-upstream (aka qemu-xen) or qemu-traditional?  And what
> > > > guest are you using?  Is there anything on the xen console
> > > > (either via the serial port or 'xl dmesg')?
> > > >
> > > > At first glance it looks like maybe qemu is trying to access,
> > > > via the mapcache, pages which have been ballooned out.  But it
> > > > seems like it should only be doing so in response to a guest
> > > > request -- is this correct, Anthony?
> > >
> > > Yes, this look like a guest IO request. One things I don't know
> > > is what happen if there is guest addresses present in the
> > > mapcache that have been balloon out, then but back in the guest,
> > > are those addresses in mapcache still correct?
> > >
> >
> > I'm also curious about this. There is a window between memory
> > balloon out and QEMU invalidate mapcache.
> 
> That by itself is OK; I don't think we need to provide any meaningful
> semantics if the guest is accessing memory that it's ballooned out.
> 
> The question is where the SIGBUS comes from: either qemu has a mapping
> of the old memory, in which case it can write to it safely, or it
> doesn't, in which case it shouldn't try.
> 

The error always happens at the memcpy in the if (is_write) branch of
address_space_rw.

We found that, after the last xen_invalidate_map_cache, the mapcache entry 
related to the failed address was mapped:
        ==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
        ==xen_remap_bucket== begin size=1048576 ,address_index=7a3
        ==xen_remap_bucket== end entry->paddr_index=7a3,entry->vaddr_base=2a2d9000,size=1048576,address_index=7a3
After that, QEMU used the mapped address successfully more than 20 times
before the error occurred.
      ...
        ==xen_map_cache== phys_addr=7a3c1ec0 size=0 lock=0
        ==xen_map_cache== return 2a2d9000+c1ec0=2a39aec0
        ==address_space_rw== ptr=2a39aec0
        ==xen_map_cache== phys_addr=7a3c1ec4 size=0 lock=0
        ==xen_map_cache==first return 2a2d9000+c1ec4=2a39aec4
        ==address_space_rw== ptr=2a39aec4
        ==xen_map_cache== phys_addr=7a3c1ec8 size=0 lock=0
        ==xen_map_cache==first return 2a2d9000+c1ec8=2a39aec8
        ==address_space_rw== ptr=2a39aec8
        ==xen_map_cache== phys_addr=7a3c1ecc size=0 lock=0
        ==xen_map_cache==first return 2a2d9000+c1ecc=2a39aecc
        ==address_space_rw== ptr=2a39aecc
        ==xen_map_cache== phys_addr=7a16c108 size=0 lock=0
        ==xen_map_cache== return 2a407000+6c108=2a473108
        ==xen_map_cache== phys_addr=7a16c10c size=0 lock=0
        ==xen_map_cache==first return 2a407000+6c10c=2a47310c
        ==xen_map_cache== phys_addr=7a16c110 size=0 lock=0
        ==xen_map_cache==first return 2a407000+6c110=2a473110
        ==xen_map_cache== phys_addr=7a395000 size=0 lock=0
        ==xen_map_cache== return 2a2d9000+95000=2a36e000
        ==address_space_rw== ptr=2a36e000
      here, the SIGBUS error occurred.
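
To make the arithmetic in the log above easier to follow, here is a minimal sketch of how a guest physical address appears to be split into a bucket index and an offset. It assumes 1 MiB buckets (from size=1048576 in the log) and 4 KiB pages; the helper names are hypothetical and are not the actual QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1 MiB buckets, matching size=1048576 in the log. */
#define BUCKET_SHIFT 20
#define BUCKET_SIZE  (UINT64_C(1) << BUCKET_SHIFT)
/* Assumption: 4 KiB pages. */
#define PAGE_SHIFT   12

/* Hypothetical helper: which bucket a guest physical address falls in. */
static uint64_t bucket_index(uint64_t phys_addr)
{
    return phys_addr >> BUCKET_SHIFT;
}

/* Hypothetical helper: byte offset of the address inside its bucket. */
static uint64_t bucket_offset(uint64_t phys_addr)
{
    return phys_addr & (BUCKET_SIZE - 1);
}

/* Hypothetical helper: page number of the address inside its bucket. */
static uint64_t page_in_bucket(uint64_t phys_addr)
{
    return bucket_offset(phys_addr) >> PAGE_SHIFT;
}

/* Hypothetical helper: host pointer value for a bucket mapped at vaddr_base. */
static uint64_t mapped_addr(uint64_t vaddr_base, uint64_t phys_addr)
{
    assert(vaddr_base != 0); /* an unmapped bucket has no base address */
    return vaddr_base + bucket_offset(phys_addr);
}
```

With vaddr_base=0x2a2d9000, phys_addr 0x7a3c1ec0 gives index 0x7a3, offset 0xc1ec0 and pointer 0x2a39aec0, and the failing phys_addr 0x7a395000 gives index 0x7a3, offset 0x95000 and pointer 0x2a36e000. Both match the log. Under the 4 KiB page assumption, the failing access is in the same bucket as the earlier successful ones but on a different page (0x95 rather than 0xc1).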


> Is this running on a linux kernel?  ISTR some BSD kernels would
> demand-populate foreign mappings, which might fail like this.
> 

It's a Linux kernel.

--weidong
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

