Re: [ANNOUNCE] Xen 4.15 - call for notification/status of significant bugs
On Thu, Feb 4, 2021 at 9:21 AM Dario Faggioli <dfaggioli@xxxxxxxx> wrote:
>
> On Thu, 2021-02-04 at 12:12 +0000, Ian Jackson wrote:
> > B. "scheduler broken" bugs.
> >
> > Information from
> >   Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >   Dario Faggioli <dfaggioli@xxxxxxxx>
> >
> > Quoting Andrew Cooper
> > > We've had 4 or 5 reports of Xen not working, and very little
> > > investigation on whats going on. Suspicion is that there might be
> > > two bugs, one with smt=0 on recent AMD hardware, and one more
> > > general "some workloads cause negative credit" and might or might
> > > not be specific to credit2 (debugging feedback differs - also might
> > > be 3 underlying issue).
> >
> > I reviewed a thread about this and it is not clear to me where we are
> > with this.
> >
> Ok, let me try to summarize the current status.
>
> - BUG: credit=sched2 machine hang when using DRAKVUF
>
>   https://lists.xen.org/archives/html/xen-devel/2020-05/msg01985.html
>   https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg01561.html
>   https://bugzilla.opensuse.org/show_bug.cgi?id=1179246
>
>   99% sure that it's a Credit2 scheduler issue.
>   I'm actively working on it.
>   "Seems a tricky one; I'm still in the analysis phase"
>
>   Manifests only with certain combination of hardware and workload.
>   I'm not reproducing, but there are multiple reports of it (see
>   above). I'm investigating and trying to come up at least with
>   debug patches that one of the reporter should be able and willing to
>   test.
>
> - Null scheduler and vwfi native problem
>
>   https://lists.xenproject.org/archives/html/xen-devel/2021-01/msg01634.html
>
>   RCU issues, but manifests due to scheduler behavior (especially
>   NULL scheduler, especially on ARM).
>   I'm actively working on it.
>
>   Patches that should solve the issue for ARM posted already. They
>   will need to be slightly adjusted to cover x86 as well. Waiting a
>   couple days more for a confirmation from the reporter that the
>   patches do help, at least on ARM.
>

I've also run into the null scheduler causing CPU lockups on x86, which
required a physical machine reboot. It seems to be triggered by domain
destruction when destroying fork VMs, and it happens only intermittently.

Tamas