Re: Linux DomU freezes and dies under heavy memory shuffling
On 17.02.21 09:12, Roman Shaposhnik wrote:
> Hi Jürgen, thanks for taking a look at this. A few comments below:
>
> On Tue, Feb 16, 2021 at 10:47 PM Jürgen Groß <jgross@xxxxxxxx> wrote:
>> On 16.02.21 21:34, Stefano Stabellini wrote:
>>> + x86 maintainers
>>>
>>> It looks like the tlbflush is getting stuck?
>>
>> I have seen this case multiple times on customer systems now, but
>> reproducing it reliably seems to be very hard.
>
> It is reliably reproducible under my workload but it takes a long time
> (~3 days of the workload running in the lab).

This is by far the best reproduction rate I have seen up to now.

The next best reproducer seems to be a huge installation with several
hundred hosts and thousands of VMs with about 1 crash each week.

>> I suspected fifo events to be blamed, but just yesterday I've been
>> informed of another case with fifo events disabled in the guest.
>>
>> One common pattern seems to be that up to now I have seen this effect
>> only on systems with Intel Gold CPUs. Can it be confirmed to be true
>> in this case, too?
>
> I am pretty sure mine isn't -- I can get you full CPU specs if that's
> useful.

Just the output of "grep model /proc/cpuinfo" should be enough.

>> In case anybody has a reproducer (either in a guest or dom0) with a
>> setup where a diagnostic kernel can be used, I'd be _very_ interested!
>
> I can easily add things to Dom0 and DomU. Whether that will disrupt the
> experiment is, of course, another matter. Still please let me know what
> would be helpful to do.

Is there a chance to switch to an upstream kernel in the guest? I'd like
to add some diagnostic code to the kernel and creating the patches will
be easier this way.

Juergen