
Re: Null scheduler and vwfi native problem



Thanks for the responses.

On 1/22/21 12:35 AM, Dario Faggioli wrote:
On Thu, 2021-01-21 at 19:40 +0000, Julien Grall wrote:
Hi Dario,

Hi!

On 21/01/2021 18:32, Dario Faggioli wrote:
On Thu, 2021-01-21 at 11:54 +0100, Anders Törnqvist wrote:
https://lists.xenproject.org/archives/html/xen-devel/2018-09/msg01213.html
.

Right. Back then, PCI passthrough was involved, if I remember
correctly. Is it the case for you as well?
PCI passthrough is not yet supported on Arm :). However, the bug was
reported with platform device passthrough.

Yeah, well... That! Which indeed is not PCI. Sorry for the terminology
mismatch. :-)

Well, I'll think about it.

Starting the system without "sched=null vwfi=native" does not result in
the problem.

Ok, how about, if you're up for some more testing:

   - booting with "sched=null" but not with "vwfi=native"
   - booting with "sched=null vwfi=native" but not doing the IRQ
     passthrough that you mentioned above

?
I think we can skip the testing as the bug was fully diagnosed back
then. Unfortunately, I don't think a patch was ever posted.

True. But a hackish debug patch was provided and, back then, it
worked.

OTOH, Anders seems to be reporting that such a patch did not work here.
I also continue to think that we're facing the same or a very similar
problem... But I'm curious why applying the patch did not help this
time. And that's why I asked for more testing.
I ran the tests as suggested, to shed some more light if needed.

- booting with "sched=null" but not with "vwfi=native"
Without "vwfi=native", destroying and re-creating the domain works fine.
Both printouts come after a destroy:
(XEN) End of domain_destroy function
(XEN) End of complete_domain_destroy function


- booting with "sched=null vwfi=native" but not doing the IRQ passthrough that you mentioned above
"xl destroy" gives
(XEN) End of domain_destroy function

Then "xl create" prints nothing, but the domain has not started correctly. "xl list" looks like this for the domain:
mydomu                                   2   512     1 ------       0.0


Anyway, it's true that we left the issue pending, so something like
this:

  From Xen's PoV, any pCPU executing guest context can be considered
quiescent. So one way to solve the problem would be to mark the pCPU
when entering the guest.

Should be done anyway.

We'll then see if it actually solves this problem too, or if this is
really something else.
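
To make the idea quoted above concrete, here is a toy model in C. It is
only a sketch and not Xen code: struct pcpu, gp_start(),
report_quiescent(), gp_completed() and NR_CPUS are all hypothetical names
invented for illustration. The assumption it encodes is that a deferred
callback (such as the second half of domain destruction) runs only once
every pCPU has been seen quiescent, and that a pCPU which stays in guest
context never gets to report that by itself.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4 /* hypothetical pCPU count, for this model only */

/* Per-pCPU state in this toy model (not Xen's real data structures). */
struct pcpu {
    bool in_guest; /* set on guest entry, cleared on return to the hypervisor */
    bool reported; /* pCPU explicitly reported a quiescent state */
};

/* Begin a grace period: every pCPU must be seen quiescent again. */
static void gp_start(struct pcpu cpus[NR_CPUS])
{
    for (int i = 0; i < NR_CPUS; i++)
        cpus[i].reported = false;
}

/* A pCPU running hypervisor code passes through a quiescent state. */
static void report_quiescent(struct pcpu cpus[NR_CPUS], int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    cpus[cpu].reported = true;
}

/*
 * The grace period is over once every pCPU is quiescent. With
 * honour_guest_mark set, a pCPU marked as being in guest context counts
 * as quiescent even though it never reported anything itself.
 */
static bool gp_completed(const struct pcpu cpus[NR_CPUS], bool honour_guest_mark)
{
    for (int i = 0; i < NR_CPUS; i++) {
        bool quiescent = cpus[i].reported ||
                         (honour_guest_mark && cpus[i].in_guest);
        if (!quiescent)
            return false;
    }
    return true;
}
```

In this model, if pCPU 3 has in_guest set and never reports,
gp_completed(cpus, false) stays false after pCPUs 0-2 have reported,
while gp_completed(cpus, true) returns true. Whether the real problem
reduces to exactly this is what the patch and further testing would have
to show.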

Thanks for the summary, BTW. :-)

I'll try to work on a patch.
Thanks, just let me know if I can do some testing to assist.

Regards

[1]
https://lore.kernel.org/xen-devel/acbeae1c-fda1-a079-322a-786d7528ecfc@xxxxxxx/




 

