
Re: [Xen-devel] Xen HPET improvement proposal

>>> Andrew Cooper <andrew.cooper3@xxxxxxxxxx> 10/25/13 2:22 PM >>>
>The root problems with the current situation are twofold.  Xen runs the
>action handler for all IRQs with interrupts enabled and having already
>been acknowledged at the LAPIC.

In a subset of cases. And, as pointed out before, this early ack-ing is
very likely wrong in the HPET case.

>Independently of the HPET issues themselves, I have identified a race
>condition in the mwait-idle routines where a cpu which is preparing to
>sleep can arrange for another cpu to wake it up, and have that other cpu
>wake it up before it has enabled its mwait trigger, meaning that it will
>idle for an arbitrary length of time in mwait.  Realistically, the cpu
>will be woken up by the time calibration rendezvous once a second, and
>possibly by the watchdog NMI every half second.

Which is an awfully long period of time... Looking forward to seeing
further details on this.

>For the new mechanism, I propose that HPET interrupts get a
>direct_apic_vector and completely bypass the IRQ mechanism.  This gives
>the HPET interrupts guaranteed higher priority than all guest
>interrupts.  When a cpu wishes to idle, it tries to find an HPET.  If there
>is a free HPET, the cpu becomes the owner of the HPET.  It sets the HPET
>up to interrupt itself at some point in the future and goes to sleep.

I agree - it should have been done this way from the beginning.
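
To make the "free HPET" case concrete, here is a minimal sketch of how a
cpu might claim an unowned channel. It is not taken from the Xen tree:
the names (hpet_channel, claim_free_channel, NR_CHANNELS, NO_OWNER) are
hypothetical, and programming the comparator and the direct APIC vector
is left out entirely. The only point shown is that the claim has to be
an atomic compare-and-swap, so that two cpus going idle at the same time
cannot both end up owning the same channel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CHANNELS 3          /* hypothetical number of HPET channels */
#define NO_OWNER    (-1)       /* hypothetical marker for a free channel */

struct hpet_channel {
    atomic_int owner;          /* owning cpu, or NO_OWNER when free */
    uint64_t   deadline;       /* absolute wakeup time chosen by the owner */
};

struct hpet_channel channels[NR_CHANNELS];

/* Mark every channel as free. */
void channels_init(void)
{
    for (int i = 0; i < NR_CHANNELS; i++) {
        atomic_init(&channels[i].owner, NO_OWNER);
        channels[i].deadline = 0;
    }
}

/*
 * Try to become the owner of a free channel.
 * Returns the channel index, or -1 if every channel is already owned
 * (in which case the cpu would have to share with another owner).
 */
int claim_free_channel(int cpu, uint64_t deadline)
{
    assert(cpu >= 0);

    for (int i = 0; i < NR_CHANNELS; i++) {
        int expected = NO_OWNER;

        /* Succeeds only if the channel is still free at this instant. */
        if (atomic_compare_exchange_strong(&channels[i].owner,
                                           &expected, cpu)) {
            /* A real implementation would program the comparator here. */
            channels[i].deadline = deadline;
            return i;
        }
    }
    return -1;
}
```

With three channels, the first three cpus calling claim_free_channel()
each get their own channel and the fourth gets -1, which is the point at
which the sharing logic discussed below would take over.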

>If there is not a free HPET, a cpu will need to share with another cpu. 
>If this cpu can find another HPET which will fire at an appropriate
>time, the cpu can merely ask to be woken up by the HPET owner
>when the owner wakes up.  If all the HPETs are programmed to fire a
>sufficient time into the future, one needs to be shortened.  The cpu
>should choose the soonest HPET, add itself to the owner's list of other
>pcpus to wake, and reprogram the HPET to fire sooner.  It should not
>reprogram the HPET to point to itself.

I think blindly looking for the one with the closest wakeup is not ideal:
For one, on huge systems this requires you to scan through too many
other CPUs. And taking NUMA aspects into consideration here would
seem at the very least desirable too (i.e. prefer sharing with a CPU
close to the one looking for a "partner").
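
A rough sketch of what such a NUMA-aware partner search could look like
follows. Everything in it is an assumption made for illustration: the
node field, the pick_partner() name, and in particular the notion of
"suitable" (the channel fires no later than the requested time and at
most slack ticks earlier). Channels owned by a cpu on the caller's node
are tried first, and the search stops at the first acceptable one
instead of hunting for the globally closest wakeup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_OWNER (-1)          /* hypothetical marker for a free channel */

struct hpet_channel {
    int      owner;            /* owning cpu, or NO_OWNER */
    int      node;             /* NUMA node of the owner (hypothetical) */
    uint64_t deadline;         /* time at which the channel will fire */
};

/* Assumed policy: fires at or before 'wanted', but not too early. */
static int suitable(const struct hpet_channel *c,
                    uint64_t wanted, uint64_t slack)
{
    return c->owner != NO_OWNER &&
           c->deadline <= wanted &&
           wanted - c->deadline <= slack;
}

/*
 * Return the index of a channel to share, preferring owners on my_node.
 * Returns -1 if nothing suitable exists, i.e. some channel would have
 * to be reprogrammed to fire sooner.
 */
int pick_partner(const struct hpet_channel *ch, size_t n,
                 int my_node, uint64_t wanted, uint64_t slack)
{
    assert(ch != NULL || n == 0);

    /* Pass 1: owners on the same node, first acceptable match wins. */
    for (size_t i = 0; i < n; i++)
        if (ch[i].node == my_node && suitable(&ch[i], wanted, slack))
            return (int)i;

    /* Pass 2: fall back to owners on remote nodes. */
    for (size_t i = 0; i < n; i++)
        if (ch[i].node != my_node && suitable(&ch[i], wanted, slack))
            return (int)i;

    return -1;
}
```

The two passes here still walk one flat array; on a huge system one
would presumably keep per-node lists so that the first pass only ever
touches local state.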

>The final requirement makes it far, far easier to validate the
>correctness of the fix, and in particular that
>interrupts are arriving at the expected cpu.  Given a validated solution
>proved to work, it might be possible to relax the requirement, so long
>as a reasonable solution to waking up the original owner is found (and I
>can't offhand think of a neat way of doing this, as ownership could move
>around arbitrarily).

Perhaps if this ownership change were restricted (in that only the
owner itself can transfer [or give up] ownership), there wouldn't be
much of a problem: Since it's always the owner that gets woken, it would
simply need to re-establish a suitable new timeout. Once it's the owner's
turn, it would transfer ownership (or mark the channel unowned). Since
the interrupts wouldn't be subject to (normal) IRQ migration anymore,
there also shouldn't be any false wakeups (which otherwise could
introduce races).
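
As a sketch of the owner-only hand-off described above (assumptions
throughout: the per-cpu wanted[] array, the function names, and the
choice of the earliest remaining waiter as the new owner are all made up
for illustration, and locking as well as retargeting the interrupt are
omitted). The handler runs only on the current owner. It collects the
sharers whose time has come, keeps ownership and re-arms the channel
while its own deadline is still pending, and only once its own turn has
come does it pass the channel on or mark it unowned.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS    8            /* hypothetical, small enough for a mask */
#define NO_OWNER    (-1)
#define NO_DEADLINE UINT64_MAX   /* cpu is not waiting on this channel */

struct hpet_channel {
    int      owner;              /* only the owner ever changes this */
    uint64_t programmed;         /* time the channel is set to fire */
    uint64_t wanted[MAX_CPUS];   /* requested wakeup time per cpu */
};

void channel_init(struct hpet_channel *ch, int owner, uint64_t deadline)
{
    assert(owner >= 0 && owner < MAX_CPUS);
    for (int cpu = 0; cpu < MAX_CPUS; cpu++)
        ch->wanted[cpu] = NO_DEADLINE;
    ch->owner = owner;
    ch->wanted[owner] = deadline;
    ch->programmed = deadline;
}

/* A sharer registers itself; it may shorten the timeout, never own it. */
void channel_share(struct hpet_channel *ch, int cpu, uint64_t deadline)
{
    assert(cpu >= 0 && cpu < MAX_CPUS && ch->owner != NO_OWNER);
    ch->wanted[cpu] = deadline;
    if (deadline < ch->programmed)
        ch->programmed = deadline;
}

/*
 * Runs on the owner when the channel fires at time 'now'.
 * Returns a bitmask of the other cpus the owner has to wake.
 */
unsigned int channel_fired(struct hpet_channel *ch, uint64_t now)
{
    unsigned int wake = 0;
    int next = NO_OWNER;         /* cpu with the earliest pending deadline */

    assert(ch->owner != NO_OWNER);
    for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
        if (ch->wanted[cpu] == NO_DEADLINE)
            continue;
        if (ch->wanted[cpu] <= now) {
            if (cpu != ch->owner)
                wake |= 1u << cpu;
            ch->wanted[cpu] = NO_DEADLINE;
        } else if (next == NO_OWNER ||
                   ch->wanted[cpu] < ch->wanted[next]) {
            next = cpu;
        }
    }

    /* The owner's turn has come: hand over, or mark the channel unowned. */
    if (ch->wanted[ch->owner] == NO_DEADLINE)
        ch->owner = next;

    /* Re-establish a suitable new timeout, if anyone is still waiting. */
    ch->programmed = (next == NO_OWNER) ? NO_DEADLINE : ch->wanted[next];
    return wake;
}
```

With cpu 0 owning a channel for time 1000 and cpus 2 and 3 sharing it
for 400 and 1500, the firing at 400 wakes cpu 2 and leaves cpu 0 as
owner with a new timeout of 1000; the firing at 1000 transfers ownership
to cpu 3; the firing at 1500 leaves the channel unowned.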

