
Re: [Xen-devel] qemu-upstream triggering OOM killer



On Fri, 17 Feb 2017, Jan Beulich wrote:
> >>> On 16.02.17 at 19:38, <sstabellini@xxxxxxxxxx> wrote:
> > On Thu, 16 Feb 2017, Jan Beulich wrote:
> >> >>> On 16.02.17 at 16:23, <JBeulich@xxxxxxxx> wrote:
> >> >>>> On 14.02.17 at 15:56, <anthony.perard@xxxxxxxxxx> wrote:
> >> >> On Fri, Feb 10, 2017 at 02:54:23AM -0700, Jan Beulich wrote:
> >> >>> Not so far. It appears to happen when grub clears the screen
> >> >>> before displaying its graphical menu, so I'd rather suspect an issue
> >> >>> with a graphics related change (the one you pointed out isn't).
> >> >> 
> >> >> I tried to reproduce this, by limiting the amount of memory available to
> >> >> qemu using cgroups, but about 44MB of memory is enough to boot a guest
> >> >> (tried Ubuntu and Debian).
> >> > 
> >> > Okay, not a qemuu regression after all, but a libxc one. It just so
> >> > happens that qemut tries to allocate a much larger amount, which
> >> > triggers mmap() failure earlier and hence doesn't manage to trigger
> >> > the oom killer. Patch (almost) on its way.
> >> 
> >> Patch sent, allowing that guest to get further (and Windows to
> >> properly boot). However, now the guest is stuck right at the point
> >> where X wants to switch to its designated video mode, with qemu
> >> (for somewhere between half a minute and a minute) consuming
> >> one full CPU's bandwidth. Once qemu's CPU consumption went
> >> down, no further progress is being made though.
> >> 
> >> Again I'd be thankful for hints on how to debug such a situation.
> > 
> > I would bisect it. It's probably due to a change in the cirrus vga code
> > or common vga code. It might be worth testing with stdvga=1 to narrow it
> > down.
> 
> No need to bisect - I finally remembered the behavior matching a
> regression I had spotted back in December with a security backport
> to one of our older trees. Commit 913a87885f ("display: cirrus:
> ignore source pitch value as needed in blit_is_unsafe") needs
> backporting.

Done


> Considering that this has been around for a while, it raises another
> question: Are regression fixes being actively looked for by the two
> of you, or are we depending on people running into issues for
> necessary fixes to be pulled in?

Anthony often looks at osstest results. I try to make sure either me or
somebody else is looking at outstanding bugs and regressions. In this
case for example, Anthony offered to help. I backported another fix to a
bug reported by Boris just yesterday. But for this to happen, we need to
know there is a regression in the first place. With the wide range of
guests and QEMU options available, it is not surprising that bugs slip
through. For example, I never test with Windows guests; I don't even have a
license anymore.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

