[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86: fix ordering of operations in destroy_irq()

On 29/05/2013 08:23, Jan Beulich wrote:
>>>> On 29.05.13 at 08:58, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>> The fix for XSA-36, switching the default of vector map management to
>> be per-device, exposed more readily a problem with the cleanup of these
>> vector maps: dynamic_irq_cleanup() clearing desc->arch.used_vectors
>> keeps the subsequently invoked clear_irq_vector() from clearing the
>> bits for both the in-use and a possibly still outstanding old vector.
> This resulted from an Andrew bringing the issue up on security@,
> and there are more problems here which partly he pointed out and
> partly we came to realize during the discussion there.
> One issue is that the vector map management is not really taking
> effect for devices using multiple interrupts (at present MSI-X only)
> because the initial vector assignment in create_irq() can't possibly
> take into account the vector map, as that can be (and is being, in
> map_domain_pirq()) set only after that function returns.
> Andrew suggests to immediately re-assign the vector in
> map_domain_pirq(), but that would be cumbersome as a "normal"
> re-assignment only takes effect the next time an IRQ actually
> occurs. I therefore think that we should rather suppress the initial
> assignment of a vector, deferring it until after the vector map got
> set.
> We figured that this is no security relevant as the respective
> assertion is a debug build only thing (and debug build only issues
> were decided to not be handled via issuing advisories some time
> ago), and the effect this guards against (improperly set up IRTEs
> on AMD IOMMU) is only affecting the guest itself (using - until the
> multi-vector MSI series goes in - solely vector based indexing): A
> second CPU getting the same vector allocated as is in use for
> another IRQ of the same device on another CPU, this would simply
> break the IRQs raised to that guest.
>> Once at it, also defer resetting of desc->handler until after the loop
>> around smp_mb() checking for IRQ_INPROGRESS to be clear, fixing a
>> (mostly theoretical) issue with the intercation with do_IRQ(): If we
>> don't defer the pointer reset, do_IRQ() could, for non-guest IRQs, call
>> ->ack() and ->end() with different ->handler pointers, potentially
>> leading to an IRQ remaining un-acked. The issue is mostly theoretical
>> because non-guest IRQs are subject to destroy_irq() only on (boot time)
>> error paths.
> While this change also reduces the chances of an IRQ getting raised
> along with destroy_irq() getting invoked and, due to do_IRQ() losing
> the race for desc->lock, calling ack_bad_irq() due to ->handler
> already having got set to &no_irq_type, it doesn't entirely close the
> window. While the issue is cosmetic only (the IRQ raised here has no
> meaning anymore as it ie being torn down, and hence the message
> issued from ack_bad_irq() is merely a false positive), I think that this
> could be fixed by transitioning desc->arch.used to IRQ_RESERVED
> (instead of IRQ_UNUSED) in __clear_irq_vector(), and deferring both
> the resetting of desc->handler and the setting of desc->arch.used to
> IRQ_UNUSED to an RCU callback.
>> As to the changed locking: Invoking clear_irq_vector() with desc->lock
>> held is okay because vector_lock already nests inside desc->lock (proven
>> by set_desc_affinity(), which takes vector_lock and gets called from
>> various desc->handler->ack implementations, getting invoked with
>> desc->lock held).
> The locking around vector management appears to be a wider issue:
> Changes to IRQ descriptors generally are to be protected by holding
> desc->lock, yet all of the vector management is only guarded by
> vector_lock. The patch here addresses this for destroy_irq()'s
> invocation of clear_irq_vector(), but this apparently would need
> taking care of on all other affected code paths.
> Andrew - please double check I didn't forget any of the aspects we
> discussed yesterday.
> Jan

I believe that covers the issues we discovered.  I will be back in the
office properly on Friday and can see about addressing the issues not
addressed so far in this patch.

As for the patch itself,
Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

As for the further points.

I am not sure the RCU stuff is sufficient.  There can be an unknown
number of irqs in transit somewhere in the hardware between a
destroy_irq() running ->shutdown() and desc->handler being replaced with

A grace time long enough to be fairly sure that any irqs-in-transit are
delivered will remove the chance of false-positives in ack_none(), at
the risk of some false negatives slipping by.  I suppose another trick
with -irq might be able to be used to distinguish between a late
irq-in-transit and a real spurious irq.


Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.