
Re: [Xen-devel] null scheduler bug



On Thu, 2018-09-27 at 15:15 +0200, Milan Boberic wrote:
> Hi,
> I applied patch and vwfi=native and everything works fine, I can
> create and destroy guest domain as many times as I want.
> 
> I have to ask, will this patch have any impact on performance (I will
> test it later, but I just need your opinions)?
>
Well, with a question like this, the only possible answer is "depends".
:-)

Basically, with this patch applied, there is a little bit of overhead
to be expected every time call_rcu() is invoked inside Xen. Now, you
can look at when that happens, and you'll notice that this basically
never happens in a hot path.

In your case, there is at least one call in the domain destruction
path. You can try to measure whether destroying the domain actually
takes more time _with_ "vwfi=native" (plus this patch) as compared to
how long it takes _without_ "vwfi=native" (and also without this patch).
I don't think you'll be able to appreciate any significant difference.
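
If it helps, here is a minimal sketch of how such a measurement could
be scripted from dom0. It only times how long a command takes to
return; whether that covers the whole teardown of the domain is an
assumption you should check on your setup. The helper name
time_command and the guest name "guest1" are made up for illustration.

```python
import subprocess
import time


def time_command(cmd, runs=1):
    """Run cmd `runs` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)  # raises if the command fails
        durations.append(time.perf_counter() - start)
    return durations


# Example (needs root in dom0 and a running guest; "guest1" is a made-up name):
#   print(time_command(["xl", "destroy", "guest1"]))
```

Run it once in each configuration and compare the numbers. If the
command returns before the domain is really gone, timing a destroy
followed by a create of the same guest may be more telling.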

The point is more, I think, whether "vwfi=native" helps your use case.
Have you measured that? I mean, have you checked what the difference
in performance (or latency, or whatever you're interested in) is
between the "vwfi=native" case and the default?
If you have, and "vwfi=native" helps, then you also need something like
this patch, or domain destruction won't work (in fact, I call the fact
that it takes 'around 7 seconds' not working). If "vwfi=native" does
not help your use case, then you're better off using neither it nor
this patch.

> And what this patch exactly do? I need to fully understand it because
> I need to document it in my master thesis which will be finished soon
> thanks to you people :D
> 
Have you heard about RCU? It's a very clever synchronization solution,
widely used in the Linux kernel. Xen has that too, but we use an old
version of the Linux code, and we don't use it that much.

This is, IMO, some good introductory material, but, really, just google
"RCU" or "RCU linux", and you'll hit tons of articles and docs:
https://lwn.net/Articles/262464/

Well, our implementation of RCU requires that, from time to time, the
various physical CPUs of your box become idle, or get an interrupt, or
go executing inside Xen (for hypercalls, vmexits, etc). In fact, a CPU
going through Xen is what allows us to tell that it reached a
so-called 'quiescent state', which in turn is necessary for declaring
a so-called 'RCU grace period' over.

Usually, as soon as a guest (or dom0) vCPU becomes idle, the pCPU on
which it was running does go through Xen, to figure out whether or not
there is another vCPU, from the same or from another guest, to be run.
If not, the pCPU stays idle, but it stays idle _in_Xen_, and that is
good for RCU quiescence and grace period tracking.

Now, with the combination of "sched=null" and "vwfi=native", when the
guest (or dom0) vCPU becomes idle, we _stay_in_the_guest_, until
something (typically an interrupt) comes. This means that the pCPU in
question never lets Xen's RCU know that it has gone through a
quiescent state, and grace periods risk lasting very long, if not
forever.
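
To make the idea concrete, here is a toy model of grace period
tracking. It is not Xen's code, just a sketch of the rule described
above: a grace period can be declared over only once every pCPU has
reported a quiescent state. The class and method names are invented
for the example.

```python
class GracePeriod:
    """Toy model: a grace period ends once every pCPU has reported a quiescent state."""

    def __init__(self, cpus):
        # pCPUs that have not yet passed through Xen since the period started
        self.pending = set(cpus)

    def quiescent(self, cpu):
        # Called when a pCPU goes idle in Xen, takes an interrupt, or handles a trap.
        self.pending.discard(cpu)

    def is_over(self):
        return not self.pending


gp = GracePeriod(cpus=[0, 1, 2, 3])
for cpu in (0, 1, 2):
    gp.quiescent(cpu)
stuck = gp.is_over()   # False: pCPU 3 is still idling inside its guest
gp.quiescent(3)        # e.g. an interrupt finally arrives on pCPU 3
done = gp.is_over()    # True: now the deferred callbacks can run
```

In the model, pCPU 3 plays the role of a CPU whose vCPU idles without
ever leaving the guest: until something forces it through Xen, the
grace period stays open.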

In fact, the reason why everything was working again with a printk()
was, as Julien noted, that an interrupt was being injected. Check the
old discussion on xen-devel about the RCU bug that I linked to in one
of my first messages in this thread for even more insights.

https://www.mail-archive.com/xen-devel@xxxxxxxxxxxxx/msg105388.html

https://lists.xenproject.org/archives/html/xen-devel/2017-07/msg02770.html

https://lists.xen.org/archives/html/xen-devel/2017-09/msg01855.html
https://lists.xen.org/archives/html/xen-devel/2017-09/msg03515.html
https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg01855.html

Setting qhimark, qlowmark and blimit to the values you see in the
patch partially defeats the purpose of RCU: the update of the data
structure is no longer deferred to some future point in time, but is
basically always performed synchronously with the modification. That's
why I dislike just doing it all the time, and prefer to do it only
when we're using "vwfi=native".
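
As a rough illustration of how the three knobs interact, here is a toy
model of a per-CPU callback queue, loosely following the batch tuning
logic described in the links below. It is a sketch, not Xen's or
Linux's actual code. The values 10000, 100 and 10 are what I recall as
the Linux defaults from that patch series, so treat them as an
assumption; the "eager" values of 0 are made up to show the effect and
are not the values used in the patch.

```python
UNLIMITED = float("inf")


class CallbackQueue:
    """Toy model of per-CPU RCU callback batching (not Xen's actual code)."""

    def __init__(self, qhimark, qlowmark, blimit):
        self.qhimark, self.qlowmark, self.blimit = qhimark, qlowmark, blimit
        self.cur_limit = blimit   # max callbacks processed per batch right now
        self.queue = []
        self.forced = 0           # times we asked for a quiescent state urgently

    def call_rcu(self, callback):
        self.queue.append(callback)
        if len(self.queue) > self.qhimark:
            # Queue too long: lift the batch limit and push for a grace period.
            self.cur_limit = UNLIMITED
            self.forced += 1

    def do_batch(self):
        """Run queued callbacks once a grace period has ended; return how many ran."""
        ran = 0
        while self.queue and ran < self.cur_limit:
            self.queue.pop(0)()
            ran += 1
        if self.cur_limit == UNLIMITED and len(self.queue) <= self.qlowmark:
            self.cur_limit = self.blimit  # back to normal batching
        return ran


relaxed = CallbackQueue(qhimark=10000, qlowmark=100, blimit=10)
eager = CallbackQueue(qhimark=0, qlowmark=0, blimit=10)
for q in (relaxed, eager):
    for _ in range(25):
        q.call_rcu(lambda: None)

relaxed_ran = relaxed.do_batch()  # only one batch of blimit callbacks runs
eager_ran = eager.do_batch()      # everything queued runs at once
```

With the relaxed settings nothing is ever forced and the work is
spread over several batches. With a very low qhimark every call_rcu()
pushes for a quiescent state right away, which is the "basically
synchronous" behavior mentioned above.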

For some more details about the meaning of the qhimark, qlowmark and
blimit values, check these:
https://www.systutorials.com/linux-kernels/132439/patch-rcu-batch-tuning-linux-2-6-16/
https://lwn.net/Articles/166647/

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
