[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch V2 04/46] genirq/chip: Use the first chip in irq_chip_compose_msi_msg()



On Wed, 26 Aug 2020 12:16:32 +0100,
Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> 
> The documentation of irq_chip_compose_msi_msg() claims that with
> hierarchical irq domains the first chip in the hierarchy which has an
> irq_compose_msi_msg() callback is chosen. But the code just keeps
> iterating after it finds a chip with a compose callback.
> 
> The x86 HPET MSI implementation relies on that behaviour, but that does not
> make it more correct.
> 
> The message should always be composed at the domain which manages the
> underlying resource (e.g. APIC or remap table) because that domain knows
> about the required layout of the message.
> 
> On X86 the following hierarchies exist:
> 
> 1)   vector -------- PCI/MSI
> 2)   vector -- IR -- PCI/MSI
> 
> The vector domain has a different message format than the IR (remapping)
> domain. So obviously the PCI/MSI domain can't compose the message without
> having knowledge about the parent domain, which is exactly the opposite of
> what hierarchical domains want to achieve.
> 
> X86 actually has two different PCI/MSI chips where #1 has a compose
> callback and #2 does not. #2 delegates the composition to the remap domain
> where it belongs, but #1 does it at the PCI/MSI level.
> 
> For the upcoming device MSI support it's necessary to change this and just
> let the first domain which can compose the message take care of it. That
> way the top level chip does not have to worry about it and the device MSI
> code does not need special knowledge about topologies. It just sets the
> compose callback to NULL and lets the hierarchy pick the first chip which
> has one.
> 
> Due to that the attempt to move the compose callback from the direct
> delivery PCI/MSI domain to the vector domain made the system fail to boot
> with interrupt remapping enabled because in the remapping case
> irq_chip_compose_msi_msg() keeps iterating and choses the compose callback
> of the vector domain which obviously creates the wrong format for the remap
> table.
> 
> Break out of the loop when the first irq chip with a compose callback is
> found and fixup the HPET code temporarily. That workaround will be removed
> once the direct delivery compose callback is moved to the place where it
> belongs in the vector domain.
> 
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> ---
> V2: New patch. Note, that this might break other stuff which relies on the
>     current behaviour, but the hierarchy composition of DT based chips is
>     really hard to follow.

Grepping around, I don't think there is any occurrence of two irqchips
providing irq_compose_msi() that can share a hierarchy on any real
system, so we should be fine. Famous last words.

> ---
>  arch/x86/kernel/apic/msi.c |    7 +++++--
>  kernel/irq/chip.c          |   12 +++++++++---
>  2 files changed, 14 insertions(+), 5 deletions(-)
> 
> --- a/arch/x86/kernel/apic/msi.c
> +++ b/arch/x86/kernel/apic/msi.c
> @@ -479,10 +479,13 @@ struct irq_domain *hpet_create_irq_domai
>       info.type = X86_IRQ_ALLOC_TYPE_HPET;
>       info.hpet_id = hpet_id;
>       parent = irq_remapping_get_ir_irq_domain(&info);
> -     if (parent == NULL)
> +     if (parent == NULL) {
>               parent = x86_vector_domain;
> -     else
> +     } else {
>               hpet_msi_controller.name = "IR-HPET-MSI";
> +             /* Temporary fix: Will go away */
> +             hpet_msi_controller.irq_compose_msi_msg = NULL;
> +     }
>  
>       fn = irq_domain_alloc_named_id_fwnode(hpet_msi_controller.name,
>                                             hpet_id);
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -1544,10 +1544,16 @@ int irq_chip_compose_msi_msg(struct irq_
>       struct irq_data *pos = NULL;
>  
>  #ifdef       CONFIG_IRQ_DOMAIN_HIERARCHY
> -     for (; data; data = data->parent_data)
> -#endif
> -             if (data->chip && data->chip->irq_compose_msi_msg)
> +     for (; data; data = data->parent_data) {
> +             if (data->chip && data->chip->irq_compose_msi_msg) {
>                       pos = data;
> +                     break;
> +             }
> +     }
> +#else
> +     if (data->chip && data->chip->irq_compose_msi_msg)
> +             pos = data;
> +#endif
>       if (!pos)
>               return -ENOSYS;
>  
> 
> 

Is it just me, or is this last change more complex than it ought to be?

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index 857f5f4c8098..25e18b73699c 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1544,7 +1544,7 @@ int irq_chip_compose_msi_msg(struct irq_data *data, 
struct msi_msg *msg)
        struct irq_data *pos = NULL;
 
 #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
-       for (; data; data = data->parent_data)
+       for (; data && !pos; data = data->parent_data)
 #endif
                if (data->chip && data->chip->irq_compose_msi_msg)
                        pos = data;

Though the for loop in a #ifdef in admittedly an acquired taste...

Reviewed-by: Marc Zyngier <maz@xxxxxxxxxx>

        M.

-- 
Without deviation from the norm, progress is not possible.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.