|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 for-4.21 2/9] x86/HPET: use single, global, low-priority vector for broadcast IRQ
On Thu, Oct 23, 2025 at 12:37:22PM +0200, Jan Beulich wrote:
> On 23.10.2025 10:39, Roger Pau Monné wrote:
> > On Wed, Oct 22, 2025 at 11:21:15AM +0200, Jan Beulich wrote:
> >> On 21.10.2025 15:49, Roger Pau Monné wrote:
> >>> On Tue, Oct 21, 2025 at 08:42:13AM +0200, Jan Beulich wrote:
> >>>> On 20.10.2025 18:22, Roger Pau Monné wrote:
> >>>>> On Mon, Oct 20, 2025 at 01:18:34PM +0200, Jan Beulich wrote:
> >>>>>> @@ -476,19 +486,50 @@ static struct hpet_event_channel *hpet_g
> >>>>>> static void set_channel_irq_affinity(struct hpet_event_channel *ch)
> >>>>>> {
> >>>>>> struct irq_desc *desc = irq_to_desc(ch->msi.irq);
> >>>>>> + struct msi_msg msg = ch->msi.msg;
> >>>>>>
> >>>>>> ASSERT(!local_irq_is_enabled());
> >>>>>> spin_lock(&desc->lock);
> >>>>>> - hpet_msi_mask(desc);
> >>>>>> - hpet_msi_set_affinity(desc, cpumask_of(ch->cpu));
> >>>>>> - hpet_msi_unmask(desc);
> >>>>>> +
> >>>>>> + per_cpu(vector_irq, ch->cpu)[HPET_BROADCAST_VECTOR] = ch->msi.irq;
> >>>>>> +
> >>>>>> + /*
> >>>>>> + * Open-coding a reduced form of hpet_msi_set_affinity() here.
> >>>>>> With the
> >>>>>> + * actual update below (either of the IRTE or of [just] message
> >>>>>> address;
> >>>>>> + * with interrupt remapping message address/data don't change)
> >>>>>> now being
> >>>>>> + * atomic, we can avoid masking the IRQ around the update. As a
> >>>>>> result
> >>>>>> + * we're no longer at risk of missing IRQs (provided
> >>>>>> hpet_broadcast_enter()
> >>>>>> + * keeps setting the new deadline only afterwards).
> >>>>>> + */
> >>>>>> + cpumask_copy(desc->arch.cpu_mask, cpumask_of(ch->cpu));
> >>>>>> +
> >>>>>> spin_unlock(&desc->lock);
> >>>>>>
> >>>>>> - spin_unlock(&ch->lock);
> >>>>>> + msg.dest32 = cpu_physical_id(ch->cpu);
> >>>>>> + msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
> >>>>>> + msg.address_lo |= MSI_ADDR_DEST_ID(msg.dest32);
> >>>>>> + if ( msg.dest32 != ch->msi.msg.dest32 )
> >>>>>> + {
> >>>>>> + ch->msi.msg = msg;
> >>>>>> +
> >>>>>> + if ( iommu_intremap != iommu_intremap_off )
> >>>>>> + {
> >>>>>> + int rc = iommu_update_ire_from_msi(&ch->msi, &msg);
> >>>>>>
> >>>>>> - /* We may have missed an interrupt due to the temporary masking.
> >>>>>> */
> >>>>>> - if ( ch->event_handler && ch->next_event < NOW() )
> >>>>>> - ch->event_handler(ch);
> >>>>>> + ASSERT(rc <= 0);
> >>>>>> + if ( rc > 0 )
> >>>>>> + {
> >>>>>> + ASSERT(msg.data ==
> >>>>>> hpet_read32(HPET_Tn_ROUTE(ch->idx)));
> >>>>>> + ASSERT(msg.address_lo ==
> >>>>>> + hpet_read32(HPET_Tn_ROUTE(ch->idx) + 4));
> >>>>>> + }
> >>>>>
> >>>>> The sequence of asserts seem wrong here, the asserts inside of the rc
> >>>>>> 0 check will never trigger, because there's an ASSERT(rc <= 0)
> >>>>> ahead of them?
> >>>>
> >>>> Hmm. My way of thinking was that if we get back 1 (which we shouldn't),
> >>>> we ought to check (and presumably fail on) data or address having
> >>>> changed.
> >>>
> >>> Right, but the ASSERT(rc <= 0) will prevent reaching any of the
> >>> followup ASSERTs if rc == 1?
> >>
> >> Which is no problem, as we'd be dead already anyway if the first assertion
> >> triggered. Nevertheless I've switched the if() to >= 0 (which then pointed
> >> out a necessary change in AMD IOMMU code).
> >
> > Right, so and adjusted if condition plus an ASSERT_UNREACHABLE() at
> > the end of the if code block?
>
> That is, instead of
>
> ASSERT(rc <= 0);
> if ( rc >= 0 )
> {
> ASSERT(msg.data == hpet_read32(HPET_Tn_ROUTE(ch->idx)));
> ASSERT(msg.address_lo ==
> hpet_read32(HPET_Tn_ROUTE(ch->idx) + 4));
> }
>
> you'd prefer
>
> if ( rc >= 0 )
> {
> ASSERT(msg.data == hpet_read32(HPET_Tn_ROUTE(ch->idx)));
> ASSERT(msg.address_lo ==
> hpet_read32(HPET_Tn_ROUTE(ch->idx) + 4));
> ASSERT_UNREACHABLE();
> }
>
> ? That's wrong though (for rc == 0), i.e. I fear I don't see what you mean.
Oh, I see, sorry for the suggestions, it's indeed wrong. FTAOD, what
do you plan to use then here?
You could replace the ASSERT_UNREACHABLE() for ASSERT(rc == 0) in my
suggestion I think?
Or maybe just do:
ASSERT(rc <= 0);
if ( !rc )
{
ASSERT(msg.data == hpet_read32(HPET_Tn_ROUTE(ch->idx)));
ASSERT(msg.address_lo ==
hpet_read32(HPET_Tn_ROUTE(ch->idx) + 4));
}
Was your original intention with those checks to ensure that for the
rc == 0 case the message fields remained unchanged?
> >>> IOW, we possibly want:
> >>>
> >>> if ( rc > 0 )
> >>> {
> >>> dprintk(XENLOG_ERR,
> >>> "Unexpected HPET MSI setup returned: data: %#x
> >>> address: %#lx expected data %#x address %#lx\n",
> >>> msg.data, msg.address,
> >>> ch->msi.msg.data, ch->msi.msg.address);
> >>> ASSERT_UNREACHABLE();
> >>> hpet_msi_mask(desc);
> >>> hpet_write32(msg.data, HPET_Tn_ROUTE(ch->idx));
> >>> hpet_write32(msg.address_lo, HPET_Tn_ROUTE(ch->idx) + 4);
> >>> hpet_msi_unmask(desc);
> >>> }
> >>> ASSERT(!rc);
> >>
> >> To be honest, for my taste this goes too far as to what follows an
> >> ASSERT_UNREACHABLE().
> >
> > I can understand that. It's the best way I've come up with attempting
> > to recover from a possible error in the release case, but I don't
> > particularly like it either.
> >
> >>> I'm unsure about attempting to propagate the returned values on release
> >>> builds, I guess it's slightly better than possibly using an outdated
> >>> RTE entry? Albeit this should never happen.
> >>
> >> Yes to the last remark; I don't actually see what you would want to do
> >> with the propagated return value.
> >
> > OK, I can this this not being clear. By propagate here I mean
> > propagate to the hardware registers, not to the function caller.
>
> I.e. you still think adding the two hpet_write32() is going to be useful?
> The mask/unmask, as I did say in another reply to your comments, isn't
> useful here anyway (for already not being atomic), so I wouldn't see much
> sense in having them.
Right, for it to be correct the masking would need to include the
iommu_update_ire_from_msi() call also.
> Plus of course we'd want to avoid the writes on
> release builds if the values actually match, i.e. the construct would then
> rather end up as two if-mismatch-then-write-else-assert-unreachable ones.
My concern would be that after this change we won't cope anymore with
iommu_update_ire_from_msi() returning a value > 1. Which might be
fine, as it's in theory not possible, but seems less robust than it
was before the change. I guess it's the price to pay for avoiding the
masking (unless you have other options).
Looking at the existing code is likely no worse than when
iommu_update_ire_from_msi() returning an error, and that case is
already silently ignored by hpet_msi_set_affinity(). So I think
silently ignoring > 0 is not that different, and doesn't make the
current handling much worse. It would be nice to handle those better,
but can be done separately.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |