Re: [PATCH for-4.19] xen/x86: limit interrupt movement done by fixup_irqs()
On Thu, May 16, 2024 at 05:00:54PM +0200, Jan Beulich wrote:
> On 16.05.2024 15:22, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -2527,7 +2527,7 @@ static int __init cf_check setup_dump_irqs(void)
> > }
> > __initcall(setup_dump_irqs);
> >
> > -/* Reset irq affinities to match the given CPU mask. */
> > +/* Evacuate interrupts assigned to CPUs not present in the input CPU mask. */
> > void fixup_irqs(const cpumask_t *mask, bool verbose)
> > {
>
> Evacuating is one purpose. Updating affinity, if need be, is another. I've
> been wondering more than once though whether it is actually correct /
> necessary for ->affinity to be updated by the function. As it stands you
> don't remove the respective code, though.
Yeah, I didn't want to get into updating ->affinity in this patch, so I
decided to leave that as-is.
Note however that if we do move the interrupt around we should arguably
also update ->affinity, so that the new destination is part of
->affinity. Otherwise we could end up with the interrupt assigned to
CPU(s) that are not part of the ->affinity mask. Maybe that's OK; TBH
I'm not sure I understand the purpose of the ->affinity mask, hence why
I've decided to leave it alone in this patch.
>
> > @@ -2576,7 +2576,12 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
> > release_old_vec(desc);
> > }
> >
> > - if ( !desc->action || cpumask_subset(desc->affinity, mask) )
> > + /*
> > + * Avoid shuffling the interrupt around if it's assigned to a CPU set
> > + * that's all covered by the requested affinity mask.
> > + */
> > + cpumask_and(affinity, desc->arch.cpu_mask, &cpu_online_map);
> > + if ( !desc->action || cpumask_subset(affinity, mask) )
> > {
> > spin_unlock(&desc->lock);
> > continue;
>
> First my understanding of how the two CPU sets are used: ->affinity is
> merely a representation of where the IRQ is permitted to be handled.
> ->arch.cpu_mask describes all CPUs where the assigned vector is valid
> (and would thus need cleaning up when a new vector is assigned). Neither
> of the two needs to be a strict subset of the other.
Oh, so it's allowed to have the interrupt target a CPU
(->arch.cpu_mask) that's not set in the affinity mask?
>
> (It's not really clear whether ->arch.cpu_mask is [supposed to be] a
> subset of cpu_online_map.)
Not according to the description in arch_irq_desc:
/*
* Except for high priority interrupts @cpu_mask may have bits set for
* offline CPUs. Consumers need to be careful to mask this down to
* online ones as necessary. There is supposed to always be a non-
* empty intersection with cpu_online_map.
*/
So, going by that comment, ->arch.cpu_mask is not guaranteed to be a
strict subset of cpu_online_map.
Note this is only possible when using logical destination mode; if we
removed that, the destination field could become an unsigned int
pointing to a single CPU, which would then have to be present in
cpu_online_map.
> If that understanding of mine is correct, going from just ->arch.cpu_mask
> doesn't look quite right to me, as that may include CPUs not in ->affinity.
> As in: Things could be further limited, by also ANDing in ->affinity.
Hm, my expectation would be that ->arch.cpu_mask is a subset of
->affinity, but even if it's not, what we care about in fixup_irqs() is
which CPUs the interrupt targets, as we need to move the interrupt if
the target set is not covered by the input mask. I don't think
->affinity should be taken into account for that decision; it should be
made based exclusively on which CPU(s) the interrupt targets
(->arch.cpu_mask).
> At the same time your(?) and my variant suffer from cpumask_subset()
> possibly returning true despite the CPU the IRQ is presently being
> handled on being the one that we're in the process of bringing down.
No, I'm not sure I see how that can happen. The CPU we are bringing
down will always have been cleared from the input CPU mask, and hence
cpumask_subset(->arch.cpu_mask, mask) will only return true if all the
CPUs set in ->arch.cpu_mask are also set in mask. IOW: none of the
possible target destinations is a CPU to be removed.
> In
> that case we absolutely cannot skip the move. (I'd like to note that
> there are only two possible input values of "mask" for the function. The
> case I think you've been looking into is when it's &cpu_online_map.
Well, it's cpu_online_map which already has the CPU to be offlined
cleared. See the call to cpumask_clear_cpu() ahead of calling
fixup_irqs().
> In
> which case cpumask_subset() is going to always return true with your
> change in place, if I'm not mistaken. That seems to make your change
> questionable. Yet with that I guess I'm overlooking something.)
I might be wrong, but I think you are missing that the to-be-offlined
CPU has been removed from cpu_online_map by the time it gets passed to
fixup_irqs().
> Plus there remains the question of whether updating ->affinity can indeed
> be skipped in more cases than it is right now (prior to you patch), or
> whether up front we may want to get rid of that updating (in line, I think,
> with earlier changes we did elsewhere) altogether.
I'm not sure about ->affinity TBH, that's why I've opted to leave it
alone for the time being. It doesn't seem like a very helpful field
right now.
I would expect it to be mostly static, populated with the set of CPUs
that are (NUMA-wise) closest to the device the interrupt belongs to,
but I'm not seeing any of that. It seems to be set arbitrarily to the
CPU that binds the interrupt.
Thanks, Roger.