[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v1b 1/9] x86/IRQ: deal with move-in-progress state in fixup_irqs()



On Mon, Apr 29, 2019 at 09:40:14AM -0600, Jan Beulich wrote:
> The flag being set may prevent affinity changes, as these often imply
> assignment of a new vector. When there's no possible destination left
> for the IRQ, the clearing of the flag needs to happen right from
> fixup_irqs().
> 
> Additionally _assign_irq_vector() needs to avoid setting the flag when
> there's no online CPU left in what gets put into ->arch.old_cpu_mask.
> The old vector can be released right away in this case.
> 
> Also extend the log message about broken affinity to include the new
> affinity as well, allowing to notice issues with affinity changes not
> actually having taken place. Swap the if/else-if order there at the
> same time to reduce the amount of conditions checked.
> 
> At the same time replace two open coded instances of the new helper
> function.
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> v2: Also update vector_irq[] in the code added to fixup_irqs().
> 
> --- unstable.orig/xen/arch/x86/irq.c  2019-04-29 17:34:16.726542659 +0200
> +++ unstable/xen/arch/x86/irq.c       2019-04-29 15:05:39.000000000 +0200
> @@ -242,6 +242,20 @@ void destroy_irq(unsigned int irq)
>      xfree(action);
>  }
>  
> +static void release_old_vec(struct irq_desc *desc)
> +{
> +    unsigned int vector = desc->arch.old_vector;
> +
> +    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> +    cpumask_clear(desc->arch.old_cpu_mask);
> +
> +    if ( desc->arch.used_vectors )

Wouldn't it be better to clean the bitmap when vector !=
IRQ_VECTOR_UNASSIGNED?

I haven't checked all the callers, but I don't think it's valid to
call release_old_vec with desc->arch.old_vector ==
IRQ_VECTOR_UNASSIGNED, in which case I would add an ASSERT.

> +    {
> +        ASSERT(test_bit(vector, desc->arch.used_vectors));
> +        clear_bit(vector, desc->arch.used_vectors);
> +    }
> +}
> +
>  static void __clear_irq_vector(int irq)
>  {
>      int cpu, vector, old_vector;
> @@ -285,14 +299,7 @@ static void __clear_irq_vector(int irq)

Kind of unrelated, but I think the check at the top of
__clear_irq_vector should be:

BUG_ON(desc->arch.vector == IRQ_VECTOR_UNASSIGNED);

Rather than the current:

BUG_ON(!desc->arch.vector);

There's a lot of logic that would go extremely wrong if vector is -1.

>          per_cpu(vector_irq, cpu)[old_vector] = ~irq;
>      }
>  
> -    desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> -    cpumask_clear(desc->arch.old_cpu_mask);
> -
> -    if ( desc->arch.used_vectors )
> -    {
> -        ASSERT(test_bit(old_vector, desc->arch.used_vectors));
> -        clear_bit(old_vector, desc->arch.used_vectors);
> -    }
> +    release_old_vec(desc);
>  
>      desc->arch.move_in_progress = 0;

While there it might be nice to convert move_in_progress to a boolean.

>  }
> @@ -517,12 +524,21 @@ next:
>          /* Found one! */
>          current_vector = vector;
>          current_offset = offset;
> -        if (old_vector > 0) {
> -            desc->arch.move_in_progress = 1;
> -            cpumask_copy(desc->arch.old_cpu_mask, desc->arch.cpu_mask);
> +
> +        if ( old_vector > 0 )
> +        {
> +            cpumask_and(desc->arch.old_cpu_mask, desc->arch.cpu_mask,
> +                        &cpu_online_map);
>              desc->arch.old_vector = desc->arch.vector;
> +            if ( !cpumask_empty(desc->arch.old_cpu_mask) )
> +                desc->arch.move_in_progress = 1;
> +            else
> +                /* This can happen while offlining a CPU. */
> +                release_old_vec(desc);
>          }
> +
>          trace_irq_mask(TRC_HW_IRQ_ASSIGN_VECTOR, irq, vector, &tmp_mask);
> +
>          for_each_cpu(new_cpu, &tmp_mask)
>              per_cpu(vector_irq, new_cpu)[vector] = irq;
>          desc->arch.vector = vector;
> @@ -691,14 +707,8 @@ void irq_move_cleanup_interrupt(struct c
>  
>          if ( desc->arch.move_cleanup_count == 0 )
>          {
> -            desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED;
> -            cpumask_clear(desc->arch.old_cpu_mask);
> -
> -            if ( desc->arch.used_vectors )
> -            {
> -                ASSERT(test_bit(vector, desc->arch.used_vectors));
> -                clear_bit(vector, desc->arch.used_vectors);
> -            }
> +            ASSERT(vector == desc->arch.old_vector);
> +            release_old_vec(desc);
>          }
>  unlock:
>          spin_unlock(&desc->lock);
> @@ -2391,6 +2401,33 @@ void fixup_irqs(const cpumask_t *mask, b
>              continue;
>          }
>  
> +        /*
> +         * In order for the affinity adjustment below to be successful, we
> +         * need __assign_irq_vector() to succeed. This in particular means
> +         * clearing desc->arch.move_in_progress if this would otherwise
> +         * prevent the function from succeeding. Since there's no way for the
> +         * flag to get cleared anymore when there's no possible destination
> +         * left (the only possibility then would be the IRQs enabled window
> +         * after this loop), there's then also no race with us doing it here.
> +         *
> +         * Therefore the logic here and there need to remain in sync.
> +         */
> +        if ( desc->arch.move_in_progress &&
> +             !cpumask_intersects(mask, desc->arch.cpu_mask) )
> +        {
> +            unsigned int cpu;
> +
> +            cpumask_and(&affinity, desc->arch.old_cpu_mask, &cpu_online_map);
> +
> +            spin_lock(&vector_lock);
> +            for_each_cpu(cpu, &affinity)
> +                per_cpu(vector_irq, cpu)[desc->arch.old_vector] = ~irq;
> +            spin_unlock(&vector_lock);
> +
> +            release_old_vec(desc);
> +            desc->arch.move_in_progress = 0;
> +        }
> +
>          cpumask_and(&affinity, &affinity, mask);
>          if ( cpumask_empty(&affinity) )
>          {
> @@ -2409,15 +2446,18 @@ void fixup_irqs(const cpumask_t *mask, b
>          if ( desc->handler->enable )
>              desc->handler->enable(desc);
>  
> +        cpumask_copy(&affinity, desc->affinity);
> +
>          spin_unlock(&desc->lock);
>  
>          if ( !verbose )
>              continue;
>  
> -        if ( break_affinity && set_affinity )
> -            printk("Broke affinity for irq %i\n", irq);
> -        else if ( !set_affinity )
> -            printk("Cannot set affinity for irq %i\n", irq);
> +        if ( !set_affinity )
> +            printk("Cannot set affinity for IRQ%u\n", irq);
> +        else if ( break_affinity )
> +            printk("Broke affinity for IRQ%u, new: %*pb\n",
> +                   irq, nr_cpu_ids, &affinity);

I guess it's fine to have those without rate-limiting because
fixup_irqs is only called for admin-triggered actions, so there's no
risk of console flooding.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.