Re: [Xen-devel] [PATCH] Yield to VCPU hcall, spinlock yielding
habanero@xxxxxxxxxxxxxxxxxxxxxxx wrote on 06/08/2005
> > In our original posting, we proposed that the Linux interrupt handler
> > for preemption notifications would create (or unblock) a
> > high-priority kernel thread which would then yield back to the
> > hypervisor. To Linux on other CPUs, the de-scheduled CPU would
> > appear to be busy running the high-priority thread, and all real work
> > that that CPU had been doing would be eligible for stealing.
> IMO, I don't think this alone is enough to encourage task migration.
> The primary motivator to steal is a 25% or more load imbalance, and
> an extra fake kernel thread will probably not be enough to trigger this.
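
For illustration only, here is a minimal Python sketch of the "25% or more" imbalance test described in the quoted text. It is a simplified model, not the Linux scheduler's actual code: IMBALANCE_PCT and should_steal are hypothetical names, and load is approximated by a plain count of runnable tasks.

```python
# Simplified model of a "25% or more" load-imbalance test.
# IMBALANCE_PCT and should_steal are illustrative names, not kernel symbols.
IMBALANCE_PCT = 125  # busiest load must reach 125% of the local load


def should_steal(local_nr_running: int, busiest_nr_running: int) -> bool:
    """Return True if the busiest CPU is loaded enough to pull a task from."""
    if busiest_nr_running <= 1:
        # Only the currently running task is there; nothing to take.
        return False
    return 100 * busiest_nr_running >= IMBALANCE_PCT * local_nr_running


# Two CPUs with 8 runnable tasks each; one gains a fake thread (9 vs 8).
heavily_loaded = should_steal(local_nr_running=8, busiest_nr_running=9)
# Two CPUs with 4 runnable tasks each; one gains a fake thread (5 vs 4).
lightly_loaded = should_steal(local_nr_running=4, busiest_nr_running=5)
```

Under this model, whether a single extra thread crosses the threshold depends on how many tasks are already runnable: 9 vs 8 is a 12.5% imbalance and does not qualify, while 5 vs 4 is exactly 25% and does.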
The kernel thread is needed at the very least to ensure that all user
programs on the de-scheduled CPU are available
for migration. In an important case, a program on the de-scheduled
CPU holds a futex, and another CPU goes idle because its program blocks
on the futex. We'd want the idle CPU to pick up the futex holder,
and I'm assuming (with very little actual knowledge) that the Linux scheduler
would make that happen.
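
To make the futex scenario concrete, here is a toy Python model. It rests on the same assumption I make above, namely that an idle CPU can only pull tasks that are queued on another CPU and not the task that CPU is currently running. Cpu, notify_preempted, idle_steal and the "preempt-yield" thread name are all hypothetical, chosen only for this sketch.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional

PLACEHOLDER = "preempt-yield"  # hypothetical name for the high-priority thread


@dataclass
class Cpu:
    name: str
    current: Optional[str] = None  # task occupying the CPU, if any
    runqueue: Deque[str] = field(default_factory=deque)  # runnable but waiting


def notify_preempted(cpu: Cpu) -> None:
    """Model the preemption notification: the placeholder thread takes over
    the CPU, so the task that was running goes back on the runqueue."""
    if cpu.current is not None and cpu.current != PLACEHOLDER:
        cpu.runqueue.appendleft(cpu.current)
    cpu.current = PLACEHOLDER


def idle_steal(idle: Cpu, victim: Cpu) -> Optional[str]:
    """Model an idle CPU pulling one queued task from another CPU.
    Only queued tasks are candidates, never the victim's current task."""
    if idle.current is not None or not victim.runqueue:
        return None
    task = victim.runqueue.popleft()
    idle.current = task
    return task


cpu0 = Cpu("cpu0", current="futex_holder")
cpu1 = Cpu("cpu1")  # idle: its program blocked on the futex

before = idle_steal(cpu1, cpu0)  # holder is "running", so nothing to steal
notify_preempted(cpu0)           # hypervisor de-schedules cpu0
after = idle_steal(cpu1, cpu0)   # holder is now queued and can migrate
```

In this model the first steal attempt finds nothing, and only after the placeholder thread displaces the futex holder does the idle CPU pick it up.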
> To solve this and other issues, I believe we need an extra modifier to
> the Linux kernel cpus' load value, which Xen could modify to hint the
> kernel what cpus' relative processing power is. The Linux kernel
> scheduler's per cpu load values would be something like (max_cpu_power
> / cpu_power * nr_running). Xen could update cpu_power for a few
> situations, a "long" preemption, a much faster alternative to a vcpu
> hot-unplug (don't unplug, just set cpu_power to 0), and to normalize
> load values for vcpus which have different time-slice lengths on the
> physical cpus.
> I would hope something like this could also be used without Xen on
> so it has wider appeal. One thing that comes to mind is normalizing
> cpus' load when some cpus may be "speed stepped" down for power
> management or heat issues.
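
As a rough illustration of the quoted formula (max_cpu_power / cpu_power * nr_running), here is a short Python sketch. The function name scaled_load is hypothetical, and treating cpu_power == 0 as "infinite load" is my own assumption about how the "just set cpu_power to 0" case might be represented; the proposal does not spell that out.

```python
import math


def scaled_load(nr_running: int, cpu_power: float, max_cpu_power: float) -> float:
    """Per-CPU load following the quoted formula:
    max_cpu_power / cpu_power * nr_running."""
    if cpu_power <= 0:
        # Assumption: cpu_power == 0 marks the CPU as unusable, so report an
        # infinite load and let any balancer move work away from it.
        return math.inf
    return max_cpu_power / cpu_power * nr_running


full_speed = scaled_load(nr_running=2, cpu_power=100, max_cpu_power=100)
half_speed = scaled_load(nr_running=2, cpu_power=50, max_cpu_power=100)
unplugged = scaled_load(nr_running=1, cpu_power=0, max_cpu_power=100)
```

With these numbers, a cpu at half power running two tasks reports the same load as a full-power cpu running four, which is the kind of normalization that would also cover the "speed stepped" case.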
I'd view your "cpu_power" proposal as orthogonal to (or perhaps
complementary to) our ideas on preemption
notification. It's aimed more at load-balancing and fair scheduling
than specifically at the problems that arise with the preemption of lock
holders. On the apparent CPU speed issue, does Linux account in any
way for different interrupt loads on different processors? Is a program
just out of luck if it happens to get scheduled on a processor with heavy
interrupt traffic, or will Linux notice that it's not making the same progress
as its peers and shuffle things around? It seems that your cpu_power
proposal might have something to contribute here.
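
One way the cpu_power idea might be applied to interrupt load, purely as a sketch: discount a processor's nominal power by the fraction of time it spends servicing interrupts. This is my own guess at a possible use, not something Linux is known to do here; effective_cpu_power and irq_fraction are hypothetical names.

```python
def effective_cpu_power(nominal_power: float, irq_fraction: float) -> float:
    """Scale a CPU's nominal power by the share of time left after interrupts.

    irq_fraction is the (assumed measurable) fraction of recent time spent
    in interrupt handling, between 0.0 and 1.0.
    """
    if not 0.0 <= irq_fraction <= 1.0:
        raise ValueError("irq_fraction must be between 0.0 and 1.0")
    return nominal_power * (1.0 - irq_fraction)


quiet_cpu = effective_cpu_power(nominal_power=100, irq_fraction=0.0)
busy_irq_cpu = effective_cpu_power(nominal_power=100, irq_fraction=0.25)
```

A processor spending a quarter of its time in interrupts would then look like a 75%-power cpu to the load calculation, so programs scheduled there would weigh more heavily when balancing.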