
Re: [Xen-devel] CPU Lockup bug with the credit2 scheduler

On Mon, 2020-02-17 at 11:58 -0800, Sarah Newman wrote:
> On 1/7/20 6:25 AM, Alastair Browne wrote:
> > 
> > After the tests, we decided to stick with kernel and 4.12 Xen
> > for production use running credit1 as the default scheduler.
> One person CC'ed appears to be having the same experience, where the
> credit2 scheduler leads to lockups (in this case in the domU, not the
> dom0) under relatively heavy load. It seems possible they may have the
> same root cause.
Yeah, well, if booting with `sched=credit` makes the problem disappear,
whatever the root cause really is, it seems related to Credit2.
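
To confirm which scheduler a host is actually running before and after
changing the boot parameter, something like the sketch below may help.
It assumes the usual `key : value` layout of `xl info` output; the
function name `xen_scheduler_from_info` is made up for illustration, and
the GRUB variable mentioned in the comments is distro-specific, so treat
it as an assumption rather than a universal recipe.

```shell
#!/bin/sh
# Print the scheduler name from `xl info` output read on stdin.
# (Hypothetical helper; assumes lines of the form "key   : value".)
xen_scheduler_from_info() {
    awk -F' *: *' '$1 == "xen_scheduler" { print $2 }'
}

# On a live Dom0 (needs root and the xl toolstack):
#   xl info | xen_scheduler_from_info
#
# To boot with Credit1 on a GRUB2-based distro (assumption: the distro's
# Xen GRUB script honours GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub):
#   GRUB_CMDLINE_XEN_DEFAULT="sched=credit"
# then regenerate grub.cfg and reboot.
```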

> I don't think there are, but have there been any patches since the
> 4.13.0 release which might have fixed problems with credit 2
> scheduler? If not, what would the next step be to isolating the
> problem - a debug build of Xen or something else?
Yes, having a debug build of Xen running and providing, for instance,
the info that Juergen is asking for later in this thread, i.e.:

xl vcpu-list
/usr/lib/xen/bin/xenctx -C -S -s <domu-system-map> <domid>

And I'd add myself:

xl debug-keys r ; xl dmesg

And, in general, hypervisor logs when the problem occurs (I've gone
through the threads, and I don't think I have seen any, but maybe I
missed something?).
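
To grab all of the above in one go when the lockup happens, a small
wrapper along these lines might be handy. This is only a sketch: the
function name `collect_xen_diag` and the output file names are made up,
the xenctx path is the one quoted above and may differ per distro, and
the xenctx step is skipped unless both a domain ID and a System.map path
are given.

```shell
#!/bin/sh
# collect_xen_diag OUTDIR [DOMID SYSTEM_MAP]
# Hypothetical helper: saves the diagnostics asked for in this thread
# into OUTDIR. Run as root in Dom0 while the problem is occurring.
collect_xen_diag() {
    outdir=$1
    domid=${2:-}
    sysmap=${3:-}
    mkdir -p "$outdir" || return 1

    if ! command -v xl >/dev/null 2>&1; then
        echo "xl not found; nothing collected" > "$outdir/README"
        return 0
    fi

    # Where each vCPU is and what state it is in.
    xl vcpu-list > "$outdir/vcpu-list.txt" 2>&1

    # Ask Xen to dump the scheduler run queues into its console ring,
    # then save the hypervisor log, which now includes that dump.
    xl debug-keys r
    xl dmesg > "$outdir/xl-dmesg.txt" 2>&1

    # Guest vCPU context with symbols, only if we were told which domU
    # to inspect and where its System.map lives.
    xenctx=/usr/lib/xen/bin/xenctx
    if [ -n "$domid" ] && [ -n "$sysmap" ] && [ -x "$xenctx" ]; then
        "$xenctx" -C -S -s "$sysmap" "$domid" \
            > "$outdir/xenctx-$domid.txt" 2>&1
    fi
}

# Example (values are placeholders):
#   collect_xen_diag /var/tmp/credit2-lockup <domid> <domu-system-map>
```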


is also another way to have a look, from Dom0, at whether (and if yes,
which ones and how much) the vCPUs are busy.

> If there are no merged or proposed fixes soon, it may be worth
> considering making the credit scheduler the default again until
> problems with the credit2 scheduler are resolved.
Nothing similar to what is being described has happened in our testing
(or we wouldn't have switched to Credit2, of course! :-D).

I will see about trying to reproduce this myself, but this may take a
little bit. In the meantime, if you help us by sending more logs, we're
happy to try diagnosing and fixing things.

Thanks and Regards
Dario Faggioli, Ph.D
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

