Re: [PATCH 0/2] xen: credit2: limit the number of CPUs per runqueue


I felt like providing some additional thoughts about this series, from
a release point of view (adding Paul).

Timing is *beyond tight* so if this series, entirely or partly, has any
chance to go in, it would be through some form of exception, which of
course comes with some risks, etc.

I did work hard to submit the full series, because I wanted people to
be able to see the complete solution. However, I think the series
itself can be logically split into two parts.

Basically, if we just consider patches 1 and 4 we will end up, right
after boot, with a system that has smaller runqueues. They will most
likely be balanced in terms of how many CPUs each one has, so a good
setup. This will likely (actual differences seem to depend *quite a
bit* on the actual workload) be an improvement for very large systems.
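
To illustrate what "balanced" means here, below is a minimal sketch in
Python. It is not the code from the patches: the function name, the
even-split policy and the fact that topology is ignored are all
assumptions made only for the sake of the example.

```python
def split_cpus(nr_cpus, max_per_runq):
    """Hypothetical helper: return how many CPUs each runqueue would get.

    Assumes CPUs are spread as evenly as possible over the smallest
    number of runqueues that respects the per-runqueue limit.
    """
    if nr_cpus <= 0 or max_per_runq <= 0:
        raise ValueError("nr_cpus and max_per_runq must be positive")
    # Smallest number of runqueues such that none exceeds the limit
    # (integer ceiling division).
    nr_runqs = -(-nr_cpus // max_per_runq)
    # Spread the CPUs evenly; the first `extra` runqueues get one more.
    base, extra = divmod(nr_cpus, nr_runqs)
    return [base + 1 if i < extra else base for i in range(nr_runqs)]
```

With a limit of 16, for example, 20 CPUs would end up in two runqueues
of 10 CPUs each under this policy, rather than in one of 16 and one of 4.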

This is a path that will get a decent amount of testing in OSSTests,
from now until the day of the release, I think, because booting with the
default CPU configuration and setup is what most (all?) OSSTests' jobs
do.

If the user starts to create pools, we can get to a point where the
different runqueues are unbalanced, i.e., each one has a different
number of CPUs in it, wrt the others. This, however:
* can happen already, as of today, without these patches. Whether these
  patches may make things "better" or "worse", from this point of view,
  is impossible to tell, because it actually depends on what CPUs
  the user moves among pools or puts offline, etc.
* means that the scheduler has to deal with unbalanced runqueues
  anyway, and if it doesn't, it's a bug and, again, it seems to me
  that these patches don't make things particularly better or worse.

So, again, for patches 1 and 4 alone, it looks to me that we'd get
improvements, at least in some cases; the codepath will get some
testing; and they do not introduce additional or different issues than
what we have already right now.

They also are at their second iteration, as the original patch series
was comprised of exactly those two patches.

So, I think it would be interesting if these two patches were given
a chance, even just of getting some reviews... And I would be fine
going through the formal process necessary for making that happen.

Then, there's the rest, the runqueue rebalancing thing. As said above,
I personally would love it if we released with it, but I see one rather
big issue. In fact, such a mechanism is triggered and stressed
by the dynamic creation and manipulation of cpupools (and CPU
on/off-lining), and we don't have an OSSTests test for that. Therefore,
we are not in the best position for catching issues it may have.

I can commit to doing some testing myself, but it's not the same thing
as having them in our CI, I know that very well. So, I'd be interested in
hearing what others think about these other patches as well, and I am
happy to do my best to make sure that they are working fine, if we
decide to try to include them too, but I do see this as much more of a
risk myself.

So, any thoughts? :-)

Thanks and Regards
Dario Faggioli, Ph.D
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
