[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Null scheduler and vwfi native problem



On Fri, 2021-01-22 at 14:26 +0000, Julien Grall wrote:
> Hi Anders,
> 
> On 22/01/2021 08:06, Anders Törnqvist wrote:
> > On 1/22/21 12:35 AM, Dario Faggioli wrote:
> > > On Thu, 2021-01-21 at 19:40 +0000, Julien Grall wrote:
> > - booting with "sched=null vwfi=native" but not doing the IRQ 
> > passthrough that you mentioned above
> > "xl destroy" gives
> > (XEN) End of domain_destroy function
> > 
> > Then a "xl create" says nothing but the domain has not started
> > correct. 
> > "xl list" look like this for the domain:
> > mydomu                                   2   512     1 ------      
> > 0.0
> 
> This is odd. I would have expected ``xl create`` to fail if something
> went wrong with the domain creation.
>
So, Anders, would it be possible to issue a:

# xl debug-keys r
# xl dmesg

And send it to us ?

Ideally, you'd do it:
 - with Julien's patch (the one he sent the other day, and that you 
   have already given a try to) applied
 - while you are in the state above, i.e., after having tried to 
   destroy a domain and failing
 - and maybe again after having tried to start a new domain

> One possibility is the NULL scheduler doesn't release the pCPUs until
> the domain is fully destroyed. So if there is no pCPU free, it
> wouldn't 
> be able to schedule the new domain.
> 
> However, I would have expected the NULL scheduler to refuse the
> domain 
> to create if there is no pCPU available.
> 
Yeah but, unfortunately, the scheduler does not have it easy to fail
domain creation at this stage (i.e., when we realize there are no
available pCPUs). That's the reason why the NULL scheduler has a
waitqueue, where vCPUs that cannot be put on any pCPU are put.

Of course, this is a configuration error (or a bug, like maybe in this
case :-/), and we print warnings when it happens.

> @Dario, @Stefano, do you know when the NULL scheduler decides to 
> allocate the pCPU?
> 
On which pCPU to allocate a vCPU is decided in null_unit_insert(),
called from sched_alloc_unit() and sched_init_vcpu().

On the other hand, a vCPU is properly removed from its pCPU, hence
making the pCPU free for being assigned to some other vCPU, in
unit_deassign(), called from null_unit_remove(), which in turn is
called from sched_destroy_vcpu() Which is indeed called from
complete_domain_destroy().

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Attachment: signature.asc
Description: This is a digitally signed message part


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.